How to speed up a BeautifulSoup for loop

I have a for loop that parses 6 URLs to get the text of the first element with the class "GARawf". The loop works, but I noticed that the page now takes about 9 seconds to load, compared with 1 second before. Since I'm new to Django and BeautifulSoup, I'm wondering whether there is a way to refactor the code so that the view loads faster.

views.py

# create list of cities
city_list = ["Toronto", "Montreal", "Calgary", "Edmonton", "Vancouver", "Quebec"]

# create price list
prices_list = []

# set origin for flight
origin = "Madrid"
#origin_urllib = urllib.parse.quote_plus(origin)

# set headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36"
}

for i in city_list:

    # set destination for flight
    destination = i

    # set search query
    url = "https://google.com/search?q=" + origin + " to " + destination + " Google Flights"

    response = requests.get(url, headers=headers)

    soup = BeautifulSoup(response.text, 'lxml')

    # get price element
    prices = soup.find("span", attrs={"class": "GARawf"})
    if prices is not None:
        prices_list.append(prices.text.strip())
    else:
        prices_list.append("Not Available")

I would use threads to call a function that performs the requests, so that the requests run concurrently instead of one after another. You'll want to import concurrent.futures.

Then move the per-city work into a function and let a thread pool run it over city_list, as shown below. The concurrent.futures documentation has more on using thread pools.

    import concurrent.futures
    import time

    import requests
    from bs4 import BeautifulSoup


    def get_prices(city):
        # fetch the search page for one destination and return its price text
        origin = "Madrid"
        #origin_urllib = urllib.parse.quote_plus(origin)

        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36"
        }
        url = "https://google.com/search?q=" + origin + " to " + city + " Google Flights"

        with requests.get(url, headers=headers) as response:
            soup = BeautifulSoup(response.text, 'lxml')

        # get price element
        prices = soup.find("span", attrs={"class": "GARawf"})
        if prices is not None:
            return prices.text.strip()
        return "Not Available"


    def run_threadPool():
        city_list = ["Toronto", "Montreal", "Calgary", "Edmonton", "Vancouver", "Quebec"]
        prices_list = []

        with concurrent.futures.ThreadPoolExecutor() as executor:
            # map() runs get_prices concurrently and yields results in city_list order
            results = executor.map(get_prices, city_list)
            for result in results:
                # do what you need here....
                prices_list.append(result)
        return prices_list


    t1 = time.perf_counter()
    prices_list = run_threadPool()
    t2 = time.perf_counter()

    print(f'Finished in {t2-t1} seconds')
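
To see the effect without hitting the network, here is a minimal sketch that compares the sequential loop with the thread pool. The function `fake_fetch` is a hypothetical stand-in for the request plus parsing step (it only sleeps for an assumed 0.1 seconds), and `fetch_sequential`, `fetch_threaded`, and `timed` are illustrative helper names, not part of the original code.

```python
import concurrent.futures
import time

CITIES = ["Toronto", "Montreal", "Calgary", "Edmonton", "Vancouver", "Quebec"]


def fake_fetch(city, delay=0.1):
    """Hypothetical stand-in for requests.get + BeautifulSoup parsing."""
    time.sleep(delay)  # simulated network wait; real I/O also lets other threads run
    return f"price for {city}"


def fetch_sequential(cities):
    # One request after another: total time is roughly len(cities) * delay.
    return [fake_fetch(city) for city in cities]


def fetch_threaded(cities):
    # One worker per city: total time is roughly a single delay.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(cities)) as executor:
        # executor.map returns results in the same order as the input.
        return list(executor.map(fake_fetch, cities))


def timed(func, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    seq_result, seq_time = timed(fetch_sequential, CITIES)
    thr_result, thr_time = timed(fetch_threaded, CITIES)
    print(f"sequential: {seq_time:.2f} s, threaded: {thr_time:.2f} s")
    print(thr_result)
```

Because the time is spent waiting rather than computing, the threaded version should finish in about the time of the slowest single request. With real requests the gain also depends on how the remote server responds to several simultaneous connections.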
