How to call a Python function based on values read from a MongoDB collection

How can I create a document and collection in MongoDB to hold the configuration for my Python code? I want to read the attribute name, the data type, and the name of the function to call from MongoDB.

Sample MongoDB collection:

   { attributes_names: "email", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "email_valid" }
   { attributes_names: "address", attributes_datype: "string", attributes_isNull: "false", attributes_std_function: "address_valid" }


Python script and function

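from pyspark.sql.functions import expr, lower, regexp_replace

def email_valid(df):
    # lowercase the first column, then remove spaces and characters outside the allowed set
    df1 = df.withColumn(df.columns[0], regexp_replace(lower(df.columns[0]), "[^a-zA-Z0-9@._-]| ", ""))
    # extract every email-like substring from the "emails" column
    extract_expr = expr(
        r"regexp_extract_all(emails, '(\w+([\.-]?\w+)*@[A-Za-z-.]+([\.-]?\w+)*(\.\w{2,3})+)', 0)")
    df2 = df1.withColumn(df.columns[0], extract_expr)

    return df2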

How can I read all the MongoDB values in a Python script and call the function that matches each attribute?


To create a MongoDB collection from a Python script:

import pymongo
# connect to your mongodb client
client = pymongo.MongoClient(connection_url)

# connect to the database
db = client[database_name]

# get the collection
mycol = db[collection_name]

from bson import ObjectId
from random_object_id import generate

# create a sample dictionary for the collection data
mydict = { "_id": ObjectId(generate()),
           "attributes_names": "email", 
           "attributes_datype": "string", 
           "attributes_std_function" : "email_valid" }

# insert the dictionary into the collection
mycol.insert_one(mydict)

To insert multiple documents into MongoDB, use insert_many() instead of insert_one() and pass it a list of dictionaries. The list will look like this:

mydict = [{ "_id": ObjectId(generate()),
           "attributes_names": "email",
           "attributes_datype": "string",
           "attributes_std_function" : "email_valid" },
           { "_id": ObjectId(generate()),
           "attributes_names": "address",
           "attributes_datype": "string",
           "attributes_std_function" : "address_valid" }]

# insert the list of dictionaries into the collection
mycol.insert_many(mydict)
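As a rough sketch, the list passed to insert_many() could also be built from plain tuples so that every document ends up with the same keys. The helper name build_config_docs and the REQUIRED_KEYS tuple are hypothetical and are not part of pymongo. The key names are copied from the sample collection in the question. No "_id" is set here, on the assumption that you are happy to let one be generated automatically on insert.

```python
# Hypothetical helper: key names are taken from the sample collection in the question.
REQUIRED_KEYS = (
    "attributes_names",
    "attributes_datype",
    "attributes_isNull",
    "attributes_std_function",
)

def build_config_docs(rows):
    """Turn (name, datatype, is_null, function_name) tuples into config documents."""
    docs = []
    for row in rows:
        if len(row) != len(REQUIRED_KEYS):
            raise ValueError(f"expected {len(REQUIRED_KEYS)} values, got {len(row)}: {row!r}")
        docs.append(dict(zip(REQUIRED_KEYS, row)))
    return docs

docs = build_config_docs([
    ("email", "string", "false", "email_valid"),
    ("address", "string", "false", "address_valid"),
])
# docs could then be passed to mycol.insert_many(docs)
```

Keeping the key names in one place makes it harder for a typo in a single document to break the lookup later.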

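To read all the documents from the MongoDB collection into the Python script:

data = list()
for x in mycol.find():
    data.append(x)

If you also want the documents as a pandas DataFrame, keep it in a separate variable so that data stays a list of dictionaries:

import pandas as pd
df = pd.json_normalize(data)

You can then access the values as you would access an element of a list of dictionaries:

value = data[0]["attributes_names"]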
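To call the function named in each document, one common approach is to keep a dictionary that maps the stored function name to the actual Python callable and look it up at run time. The following is only a minimal sketch under stated assumptions. The VALIDATORS registry and the run_validations helper are hypothetical names. The two validators are simplified stand-ins that work on plain strings so the example runs without Spark. The config_docs list stands in for what list(mycol.find()) would return.

```python
import re

# Stand-in validators (assumption): they take a plain string instead of a Spark DataFrame.
def email_valid(value):
    return re.fullmatch(r"[\w.+-]+@[\w-]+(\.[\w-]+)+", value) is not None

def address_valid(value):
    return bool(value.strip())

# Registry mapping the function names stored in MongoDB to the real callables.
VALIDATORS = {
    "email_valid": email_valid,
    "address_valid": address_valid,
}

def run_validations(config_docs, record):
    """Apply the function named in each config document to the matching field of record."""
    results = {}
    for doc in config_docs:
        name = doc["attributes_names"]
        func_name = doc["attributes_std_function"]
        func = VALIDATORS.get(func_name)
        if func is None:
            raise KeyError(f"no function registered under {func_name!r}")
        results[name] = func(record[name])
    return results

# In the real script this list would come from list(mycol.find()).
config_docs = [
    {"attributes_names": "email", "attributes_datype": "string",
     "attributes_std_function": "email_valid"},
    {"attributes_names": "address", "attributes_datype": "string",
     "attributes_std_function": "address_valid"},
]

record = {"email": "user@example.com", "address": "12 Main Street"}
results = run_validations(config_docs, record)
```

An explicit registry is generally safer than eval() or globals() lookups, because only functions you have listed can be triggered by what is stored in the database. With the Spark version of email_valid from the question, the same lookup should work if you pass something like df.select(name) instead of a string, since that function operates on the first column of the DataFrame it receives.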