About concurrent.futures

AsyncIO - This article is part of a series.

Part 2: Understanding AsyncIO by Examples

Part 6: This Article

First things first: concurrent.futures != asyncio

While they both deal with asynchronous programming, they employ different approaches and are suited to different types of tasks.

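To be clear, asyncio and ThreadPoolExecutor both run inside a single process and, in standard CPython, are constrained by the Global Interpreter Lock (GIL): asyncio multiplexes tasks on one thread, while ThreadPoolExecutor uses several threads. Both give you concurrency, not parallelism. The exception is ProcessPoolExecutor, which starts separate processes, each with its own interpreter and GIL, and can therefore run in parallel on multiple cores.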
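One way to see the single-process, multi-thread nature of a thread pool is to ask each worker where it is running. This is only an illustrative sketch; `where_am_i` is a hypothetical helper and not part of the samples in this article.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def where_am_i(_):
    # Report the process ID and thread ID that executed this call.
    return os.getpid(), threading.get_ident()


with ThreadPoolExecutor(max_workers=4) as executor:
    reports = list(executor.map(where_am_i, range(8)))

pids = {pid for pid, _ in reports}
thread_ids = {tid for _, tid in reports}

# Every call ran in the same process; only the thread may differ.
print(f"processes: {len(pids)}, threads used: {len(thread_ids)}")
```

With ProcessPoolExecutor in place of ThreadPoolExecutor, the workers would be expected to report PIDs different from the parent's, since each worker is a separate process.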

I used to use concurrent.futures until I learnt asyncio.

Using `concurrent.futures`

concurrent.futures provides two executors:

ThreadPoolExecutor - Best for I/O-bound tasks (such as file reads and network requests).
ProcessPoolExecutor - Suitable for CPU-intensive tasks (such as math computations).

Let’s look at an example of each. Here’s a sample using ThreadPoolExecutor:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from rich import print as rprint
from timeit import default_timer as timer
import concurrent.futures
import requests

def timeit(func):
    def timed(*args, **kwargs):
        stime = timer()
        result = func(*args, **kwargs)
        etime = timer()
        rprint(f'\n [*] {func.__name__}(): completed within [{etime-stime:.4f} sec].\n ')
        return result
    return timed

def fetch_url(url):
    # A timeout keeps one unresponsive host from hanging the whole run
    resp = requests.get(url, timeout=10)
    return url, resp.status_code, resp.elapsed.total_seconds()

@timeit
def singlethread():
    for url in urls:
        _, status_code, elapsed = fetch_url(url)
        rprint(f"URL: {url} [ {status_code} / {elapsed:.4f} ]")

@timeit
def multithread():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [executor.submit(fetch_url, url) for url in urls]

        for future in concurrent.futures.as_completed(futures):
            url, status_code, elapsed = future.result()
            rprint(f"URL: {url} [ {status_code} / {elapsed:.4f} ]")


if __name__ == "__main__":

    urls = [
        "https://www.bing.com",
        "https://www.duckduckgo.com",
        "https://www.google.com",
        "https://www.yahoo.com"
    ]

    print()
    singlethread()
    multithread()

Here is the output:

Here’s a sample using ProcessPoolExecutor:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from rich import print as rprint
from timeit import default_timer as timer
import concurrent.futures
import time


def timeit(func):
    def timed(*args, **kwargs):
        stime = timer()
        result = func(*args, **kwargs)
        etime = timer()
        rprint(f'\n [*] {func.__name__}(): completed within [{etime-stime:.4f} sec].\n ')
        return result
    return timed


def my_function(x):
    # time.sleep() stands in for slow work; a truly CPU-bound task would compute here
    time.sleep(2)
    return x, x * x

@timeit
def without_executor(numbers):
    results = []
    for num in numbers:
        start_time = time.time()
        x, result = my_function(num)
        end_time = time.time()
        print(f"Without_executor: {x} squared = {result} [ {end_time - start_time:.2f} sec ]")
        results.append(result)
    return results

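@timeit
def with_executor(numbers):
    results = []
    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = [executor.submit(my_function, num) for num in numbers]

        # as_completed() yields futures in completion order, not submission order
        for f in concurrent.futures.as_completed(futures):
            x, result = f.result()
            print(f"With_executor: {x} squared = {result} ")
            results.append(result)
    # Return the computed values (not the Future objects), like without_executor()
    return results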



if __name__ == "__main__":
    numbers = [2, 3, 4]

    print()
    without_executor(numbers)
    with_executor(numbers)

Here is the output:

Key Differences

| Feature | `concurrent.futures` | `asyncio` |
| --- | --- | --- |
| Underlying mechanism | Threads / processes | Event loop |
| Best suited for | CPU-bound tasks (processes), blocking I/O (threads) | I/O-bound tasks |
| Complexity | Simpler | More complex |
| Control over execution | Less granular | More granular |

concurrent.futures

Uses threads or processes to execute tasks concurrently.
With ProcessPoolExecutor, better suited for CPU-bound tasks because it can leverage multiple cores.
Simpler to use than asyncio for basic concurrent operations.
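To give a feel for how little code a basic concurrent operation needs, here is a minimal sketch of the two common entry points, `submit()` and `map()`. The `square` function is a hypothetical stand-in for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() schedules one call and returns a Future immediately.
    future = executor.submit(square, 5)
    single = future.result()  # blocks until the call has finished

    # map() schedules one call per item and yields results in input order.
    squares = list(executor.map(square, [2, 3, 4]))

print(single, squares)
```

`map()` is convenient when results should line up with the inputs; `submit()` together with `as_completed()`, as in the samples above, is the choice when results should be handled as soon as each one is ready.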

asyncio

Uses an event-loop to manage asynchronous tasks.
Excellent for I/O-bound tasks like network requests, file operations, etc.
Offers fine-grained control over task scheduling and execution.
Requires a deeper understanding of asynchronous programming concepts.
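For comparison, a minimal asyncio sketch might look like the following. It assumes `asyncio.sleep()` as a stand-in for a network call, and `fake_fetch` is a hypothetical name used only for illustration.

```python
import asyncio


async def fake_fetch(name, delay):
    # await hands control back to the event loop while this task "waits on I/O".
    await asyncio.sleep(delay)
    return name, delay


async def main():
    # gather() runs the coroutines concurrently and returns results in argument order.
    return await asyncio.gather(
        fake_fetch("a", 0.3),
        fake_fetch("b", 0.1),
        fake_fetch("c", 0.2),
    )


results = asyncio.run(main())
print(results)
```

The three waits overlap, so the whole run should take roughly as long as the slowest task rather than the sum of all three.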

Summary

concurrent.futures is like having multiple workers that handle tasks independently. If you have multiple CPU-intensive tasks that can benefit from parallel execution, use concurrent.futures with ProcessPoolExecutor.

It uses a time-slicing model, allocating a slot of CPU time to each thread.
With many threads blocked for long periods, it begins to degrade into polling.
Time-slicing is managed by the OS, giving the programmer less control over thread scheduling.

asyncio is like having a single worker that can efficiently juggle multiple tasks without blocking. If you have many I/O-bound tasks that need to be handled efficiently without blocking the main thread, use asyncio.

It uses an event loop, and is more akin to a push-notification model.
It works by waiting for a task to announce its availability, rather than checking on each thread.
The event loop, built on polling/interrupts as its core mechanism, continuously checks (polls) for I/O events or task completion.
This makes asyncio more efficient for I/O-bound tasks, as it avoids unnecessary CPU usage while tasks are waiting for I/O.
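The "single worker juggling tasks" picture can be made concrete with a small sketch that records which thread runs each step. The names `worker` and `events` are assumptions made for this illustration.

```python
import asyncio
import threading

events = []


async def worker(name, delay):
    events.append((name, "start", threading.get_ident()))
    # While this task sleeps, the event loop is free to run the other one.
    await asyncio.sleep(delay)
    events.append((name, "done", threading.get_ident()))


async def main():
    await asyncio.gather(worker("slow", 0.2), worker("fast", 0.05))


asyncio.run(main())

for name, step, thread_id in events:
    print(f"{name:>4} {step:<5} on thread {thread_id}")
```

Both tasks start before either finishes, and every step reports the same thread ID: the interleaving comes from the event loop switching tasks at each `await`, not from the OS scheduling extra threads.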
