Queries from Django unit tests to Postgres randomly fail to complete and remain in the active state

I'm developing ETLs with the Django ORM. When I run the unit tests, either from my local environment or from Jenkins, a query sometimes stays in the active state in the DB and never completes.
I can't always reproduce the issue.
I'm using Django 3.1.7, and the tests run inside a Django "TestCase", so each test runs within a transaction.
I'm using a Google Cloud hosted Postgres (Postgres 12) DB server.
When I run the ETL outside of unit tests, everything runs smoothly. When I run the generated query separately, it works fine and runs very fast.
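
For reference, "stays active" above refers to the session's state as reported by Postgres in pg_stat_activity. Below is a minimal sketch of how such sessions could be listed from Python, assuming any DB-API cursor (for example one obtained from Django's `connections[alias].cursor()`). The function name `long_running_queries` and the 30-second threshold are illustrative and not part of the original project.

```python
from typing import Any, Dict, List

# pg_stat_activity and pg_blocking_pids() are standard in Postgres 12.
# blocked_by lists the PIDs holding locks that the session is waiting on;
# an empty array means the session is not waiting on another session's lock.
ACTIVE_QUERIES_SQL = """
SELECT pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS running_for,
       pg_blocking_pids(pid) AS blocked_by,
       query
FROM pg_stat_activity
WHERE state = 'active'
  AND pid <> pg_backend_pid()
  AND now() - query_start > make_interval(secs => %s)
ORDER BY query_start
"""


def long_running_queries(cursor, min_seconds: float = 30.0) -> List[Dict[str, Any]]:
    """Return one dict per session that has been 'active' longer than min_seconds.

    `cursor` is assumed to be a DB-API cursor using %s placeholders
    (psycopg2 or a Django database cursor).
    """
    cursor.execute(ACTIVE_QUERIES_SQL, [min_seconds])
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Hypothetical usage from a separate shell while a test appears to hang:
#
#     from django.db import connections
#     with connections["default"].cursor() as cur:
#         for row in long_running_queries(cur, min_seconds=10):
#             print(row["pid"], row["wait_event_type"], row["blocked_by"], row["query"])
```

A non-empty `blocked_by` for the stuck query would point to a lock held by another session, while an empty one with a `wait_event_type` such as `Client` would suggest the server is waiting on the client side instead.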

  • models.py definition for the tables:
class DiseaseTarget(models.Model):
    id = models.AutoField(primary_key=True, db_index=True)
    gene = models.ForeignKey(Gene, on_delete=models.CASCADE, related_name='disease_target_gene', db_index=True)
    disease = models.ForeignKey(Disease, on_delete=models.CASCADE, related_name='disease_target_disease', db_index=True)
    dev_stage = models.TextField()

    class Meta:
        managed = True
        db_table = 'disease_target'
        unique_together = (('gene', 'disease'),)

class Gene(models.Model):
    entity = models.OneToOneField(Entity, on_delete=models.CASCADE, primary_key=True, related_name='genes', db_index=True)
    symbol = models.TextField(null=True, db_index=True)

    class Meta:
        managed = True
        db_table = 'gene'

class Entity(models.Model):
    id = models.AutoField(primary_key=True, db_index=True)
    entity_id = models.TextField(db_index=True)
    TYPES = [
        ('cellnode', 'cell'),
        ('gene', 'gene'),
        ('geneset', 'geneset')
    ]
    type = models.TextField(choices=TYPES, db_index=True)
    name = models.TextField(default=entity_id)
    description = models.TextField(blank=True, null=True)


    class Meta:
        managed = True
        db_table = 'entity'
        index_together = [
            ("id", "type"),
        ]
        unique_together = (('entity_id', 'type'),)
  • ETL code:
class EtlDiseaseTarget(AbstractDataMartEtl):
    def __init__(self, database: str, data_source: str, nrows=None):
        self.database = database
        self.data_columns = ['disease_name', 'ENTREZID', 'dev_stage_category', 'disease_ontology']
        self.data_source = data_source
        self.data = self.extract(input_sep='\t', nrows=nrows)
        self.transformed_data = {'disease_target': None}

    def transform(self):
        ...
        existing_entries = DiseaseTarget.objects.using(self.database).select_related('gene__entity').filter(gene__entity__entity_id__in=self.data.ENTREZID.unique())
        ...
  • Generated DB query:
    SELECT "disease_target"."id", "disease_target"."gene_id", "disease_target"."disease_id", "disease_target"."dev_stage", "gene"."entity_id", "gene"."symbol", "entity"."id", "entity"."entity_id", "entity"."type", "entity"."name", "entity"."description", "entity"."created_at", "entity"."updated_at" FROM "disease_target" INNER JOIN "gene" ON ("disease_target"."gene_id" = "gene"."entity_id") INNER JOIN "entity" ON ("gene"."entity_id" = "entity"."id") WHERE "entity"."entity_id" IN ('7124', '1435', '1437', '3594', '3595', '51561', '149233', '3695', '8174', '3676') 
  • unittest code:
class NoneGroundTruthTest(TestCase):
    def test_disease_target(self):
        print("\t===========test_disease_target=================")
        test_disease_etl = EtlDiseaseTarget(data_source=self.disease_target_path,
                                            database=self.database,
                                            nrows=10)
        test_disease_etl.transform()
        test_disease_etl.load()
        self.assertGreater(
           len(list(DiseaseTarget.objects.using(self.database).all()[:10])), 0)
  • DB settings from settings.py:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': os.getenv('db_name_ccm'),
        'HOST': os.getenv('db_host_ccm'),
        'PORT': os.getenv('POSTGRES_PORT'),
        'USER': os.getenv('DB_USER'),
        'PASSWORD': os.getenv('DB_SECRET_PASSWORD'),
    },
}

I don't know how to solve this problem, and it is seriously hampering my development process. I'd appreciate any information you can share. Thanks!
