Django Raw SQL Query using multiple databases

I am trying to pull information from 2 different databases in the same query.

Below is an example of the code I'm trying to use with identifying information changed to generic names.

I have my data bases set up in settings like this:

DATABASES = {
 'default': {},
 'primary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle.test.com:5432/erp',
     'USER':'test1',
     'PASSWORD':'',
 },
 'secondary': {
     'ENGINE': 'django.db.backends.oracle',
     'NAME': 'oracle2.test.com:5432/wms',
     'USER':'test2',
     'PASSWORD':'',
 },
}

Then in my views I have this:

def index(request):
    sql_query = ('select \n'
             '  primary.table1.organization_id "Org", \n'
             '  primary.table2.LOCATION "Loc", \n'
             '  primary.table3.inventory "Inv", \n'
             '  primary.table4.reorder_date "Reorder", \n'
             '  secondary.table3.COMMENTS "Comments", \n'
             'from \n'
               '  primary.table2, primary.table1\n'
               '  LEFT OUTER JOIN primary.table3 On\n'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table3.SCHEDULE_NUMBER) \n'
               '  LEFT OUTER JOIN primary.table4 On \n'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table4.PARENT_SCHEDULE_NUMBER) \n'
               '  LEFT OUTER JOIN secondary.table1 On \n'
               '    (primary.table1.SCHEDULE_NUMBER = primary.table1.SCHEDULE_NUMBER) \n'
             'where
               'primary.table1.item_id = primary.table2.inventory_item and \n'
               'primary.table1.organization_id = primary.table2.organization_id \n')
    with connections['primary', 'secondary'].cursor() as cursor:
        cursor.execute(sql_query)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        table = df.to_html(index=False, classes='mystyle', justify='left')
    return render(request, 'template.html', {'table': table})

I thought this would work because the syntax I found online for SQL says to prepend the database name to each table. Unfortunately, I get a database error saying that primary or secondary is an invalid identifier.

First, connections is a dict so you should iterate on its keys to get both connections:

dbs_to_connect = ['primary','secondary']
for k in dbs:
    with connections[k].cursor() as cursor:
        # code here...

Second, I'm not sure how your dbs and schemas are configured in Oracle, but remember that the query should be always select <colums> from <schema>.<table>. Regardless, the query string should run directly in another DB software as well, so you can test it there to make sure the problem is not the query itself.

Based on some advice from other forums. I learned that it is not possible to join tables across databases. So here is my solution. Run the 2 databases as 2 different queries and then store the outputs in separate dataframes. Then use pandas merge to join the 2 dataframes together with a left join.

    Filepath = 'example filepath'
    SQL_Query1 = 'example'
    SQL_Query2 = 'example'
    with connections['primary'].cursor() as cursor:
        cursor.execute(SQL_Query1)
        field_names = [tuple(x[0] for x in cursor.description)]
        row = cursor.fetchall()
        result = field_names + row
        df = pd.DataFrame(result)
        df.rename(columns=df.iloc[0], inplace=True)
        df.drop([0], inplace=True)
        df['SCHED_#'] = pd.to_numeric(df['SCHED_#'])

    with connections['secondary'].cursor() as cursor2:
        cursor2.execute(SQL_Query2)
        wms_field_names = [tuple(x[0] for x in cursor2.description)]
        wms_row = cursor2.fetchall()
        wms_result = wms_field_names + wms_row
        wms_df = pd.DataFrame(wms_result)
        wms_df.rename(columns=wms_df.iloc[0], inplace=True)
        wms_df.drop([0], inplace=True)
        wms_df['SCHED_#'] = pd.to_numeric(wms_df['SCHED_#'])

    merged_df = pd.merge(df, wms_df, on='SCHED_#', how='left', indicator=True)
    merged_df.to_csv(filepath, index=False)

Thanks everyone for your input

Back to Top