Interview Questions & Model Answers

Real questions. Real answers. Built from 20 years of actual hiring and being hired.

1,774 Total Questions

Showing 50 questions · Python

PY-BEG-006 How does try-except-finally work in Python?

Python · Core Python · Beginner

3/10

Answer

'try' runs code that might fail. 'except' catches specific errors. 'finally' always runs regardless of whether an error occurred — used for cleanup.

Deep Explanation

The try block contains the risky code. If an exception occurs, Python looks for a matching except clause. You can catch specific exception types (except ValueError) or use a bare except to catch everything (not recommended). The optional else clause runs only if no exception occurred. The finally clause always executes, even if there was an exception or a return statement inside try, which makes it essential for releasing resources like file handles, database connections, or locks. Multiple except clauses can handle different exception types differently.
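
The clause order described above can be traced with a small sketch. The function name parse_port and the steps list are made up for illustration; the list only records which clauses ran.

```python
def parse_port(text):
    """Convert text to a port number, recording which clauses ran."""
    steps = []
    try:
        steps.append("try")
        port = int(text)          # may raise ValueError
    except ValueError:
        steps.append("except")
        port = None
    else:
        steps.append("else")      # runs only when no exception was raised
    finally:
        steps.append("finally")   # runs on both paths
    return port, steps


print(parse_port("8080"))  # (8080, ['try', 'else', 'finally'])
print(parse_port("http"))  # (None, ['try', 'except', 'finally'])
```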

Real-World Example

In a database write operation, the try block executes the INSERT query, the except block catches IntegrityError for duplicate keys and returns a meaningful error message, and the finally block closes the database connection regardless of success or failure, preventing connection pool exhaustion.
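
A minimal sketch of that pattern, assuming SQLite from the standard library as a stand-in for the production database. The users table, the insert_user helper, and the returned messages are hypothetical.

```python
import os
import sqlite3
import tempfile


def insert_user(db_path, email):
    """Insert a user; return a message instead of raising on duplicates."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY)"
        )
        conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        conn.commit()
        return "created"
    except sqlite3.IntegrityError:
        conn.rollback()
        return "duplicate email"
    finally:
        conn.close()  # runs on success, on error, and despite the returns


# Throwaway database file so the second insert sees the first one.
db_path = os.path.join(tempfile.mkdtemp(), "users.db")
first = insert_user(db_path, "a@example.com")
second = insert_user(db_path, "a@example.com")
print(first, "/", second)  # created / duplicate email
```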

⚠ Common Mistakes

Using a bare 'except:', which catches everything including KeyboardInterrupt and SystemExit and can make the program hard to stop. Not closing resources in finally, causing memory or connection leaks. Catching too broad an exception type and hiding real bugs.

🏭 Production Scenario

A production API server ran out of database connections after 6 hours because a developer forgot to close connections in a finally block. The try block opened a connection, an exception occurred, the connection was never closed, and the pool was exhausted within hours under normal traffic.

Follow-up Questions

What is the difference between except Exception and bare except? When does finally NOT execute? How do context managers (with statement) relate to try-finally?

ID: PY-BEG-006 · Difficulty: 3/10 · Level: Beginner

PY-BEG-003 What are *args and **kwargs in Python functions?

Python · Core Python · Beginner

3/10

Answer

*args collects extra positional arguments as a tuple. **kwargs collects extra keyword arguments as a dictionary. Both allow functions to accept a variable number of arguments.

Deep Explanation

When you define a function with *args, any positional arguments beyond the explicitly defined ones are packed into a tuple called args. With **kwargs, any keyword arguments not explicitly defined are packed into a dictionary called kwargs. The names args and kwargs are just convention; the * and ** operators are what matter. You can use *args and **kwargs together, and you can also use * and ** when calling functions to unpack sequences and dictionaries into arguments. This pattern is heavily used in decorators, class inheritance, and API wrappers.
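
Both directions, packing in the definition and unpacking at the call site, can be seen in a short sketch. The function describe and its arguments are invented for illustration.

```python
def describe(first, *args, **kwargs):
    """Return how the arguments were bound."""
    return first, args, kwargs


# Extra positionals land in a tuple, extra keywords in a dict.
packed = describe(1, 2, 3, unit="ms")
print(packed)    # (1, (2, 3), {'unit': 'ms'})

# The same operators unpack a sequence and a dict at the call site.
values = [10, 20]
options = {"unit": "s", "retries": 2}
unpacked = describe(*values, **options)
print(unpacked)  # (10, (20,), {'unit': 's', 'retries': 2})
```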

Real-World Example

Django's class-based views use **kwargs extensively to pass URL parameters captured by the router into view methods. FastAPI uses *args and **kwargs in middleware to forward requests without knowing the exact signature of the next handler.

⚠ Common Mistakes

Confusing *args (tuple) with a list. Forgetting that *args must come before **kwargs in the function signature. Trying to access args by keyword or kwargs by position. Mutating args thinking it is a list.

🏭 Production Scenario

A logging decorator in a production Flask app broke when a new endpoint added a keyword argument. The fix was changing the decorator to use *args and **kwargs so it would transparently forward any arguments to the wrapped function without needing updates every time a new parameter was added.

Follow-up Questions

How does ** unpacking work when calling a function? Can you have both *args and explicit keyword arguments? How are *args and **kwargs used in class __init__ with inheritance?

ID: PY-BEG-003 · Difficulty: 3/10 · Level: Beginner

PY-INT-006 How does pytest work and what makes a good unit test in Python?

Python · Core Python · Intermediate

4/10

Answer

pytest discovers and runs test functions automatically, providing rich assertion introspection, fixtures for dependency injection, and parametrize for data-driven tests. A good unit test is fast, isolated, deterministic, and tests one specific behavior.

Deep Explanation

pytest looks for files named test_*.py (or *_test.py), functions named test_*, and classes named Test*. When an assert fails, pytest shows the values of the expressions involved, so there is no need for assertEqual(). Fixtures (@pytest.fixture) provide setup/teardown and dependency injection for tests: database connections, temporary files, mock objects. Parametrize (@pytest.mark.parametrize) runs the same test with multiple input/output combinations, eliminating test duplication. Mocking with unittest.mock.patch replaces real dependencies with controlled fakes, making tests fast and isolated. Good unit tests test one behavior, run in milliseconds, do not hit databases, networks, or file systems (mock these), are deterministic (same result every run), and fail with clear messages.
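
The mocking part of this can be sketched with the standard library alone. The test below is a plain test_* function with a bare assert, which is the shape pytest discovers; fixtures and parametrize are not shown because they need pytest itself. RateClient and convert are hypothetical names used only for this example.

```python
from unittest.mock import patch


class RateClient:
    """Hypothetical client; the real call would go over the network."""

    def fetch_rate(self, currency):
        raise RuntimeError("network access is not allowed in unit tests")


def convert(amount, currency, client):
    """Behavior under test: multiply the amount by the current rate."""
    return round(amount * client.fetch_rate(currency), 2)


def test_convert_multiplies_amount_by_rate():
    # Replace the network call with a controlled fake for this test only.
    with patch.object(RateClient, "fetch_rate", return_value=1.25) as fake:
        result = convert(10, "EUR", RateClient())
    assert result == 12.5
    fake.assert_called_once_with("EUR")
```

Saved in a file such as test_convert.py, the function would be picked up by pytest's discovery rules; it can also be called directly, since it is an ordinary function.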

Real-World Example

A FastAPI endpoint test: the test uses a pytest fixture providing a TestClient (an in-process HTTP client), patches the database dependency with an in-memory mock, uses parametrize to test valid, invalid, and edge-case inputs, and has clear test names like test_create_user_returns_201_for_valid_input. Each test runs in under 5ms with no external dependencies.

⚠ Common Mistakes

Writing tests that test implementation details instead of behavior; tests should not break when you refactor internals. Not mocking external dependencies, making tests slow and flaky. Using a single large test function that tests multiple behaviors (impossible to tell which behavior failed). Asserting too broadly (assert response is not None) or too narrowly (asserting on exact internal state).

🏭 Production Scenario

A Django e-commerce platform's test suite took 45 minutes to run because 800 tests were hitting the actual test database. Refactoring to use pytest fixtures with database mocking and factory_boy for test data generation reduced the suite to 3 minutes, enabling CI to run on every commit.

Follow-up Questions

What is the difference between mocking and stubbing? How do you test async functions with pytest? What is property-based testing with Hypothesis?

ID: PY-INT-006 · Difficulty: 4/10 · Level: Intermediate

PY-JR-004 Can you explain how to implement a simple linear regression model using Python libraries like NumPy or scikit-learn?

Python · AI & Machine Learning · Junior

4/10

Answer

You can implement linear regression in Python using scikit-learn by first importing the LinearRegression class, then fitting it with your input features and target variable. After training, you can use the model to make predictions with the predict method.

Deep Explanation

Linear regression is a fundamental machine learning algorithm used for predicting a continuous target variable based on one or more input features. In Python, you typically start by importing the necessary libraries such as NumPy and scikit-learn. After loading your dataset, you need to split it into features and the target variable. Using scikit-learn's LinearRegression, you create an instance of the model and call the fit method with your features and target variable. This process finds the best-fitting line by minimizing the sum of squared differences between the predicted and actual values (ordinary least squares). Finally, you can assess the model's performance using metrics like R-squared and mean squared error and make predictions with new data using the predict method. Edge cases to consider include multicollinearity, where inputs are highly correlated and individual coefficients become unstable, and outliers, which can disproportionately affect the fit.
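
The least-squares fit itself can be sketched with NumPy alone. This takes the NumPy route rather than scikit-learn's LinearRegression, which solves the same problem behind fit and predict. The data is made up and generated from y = 3x + 2 so the expected coefficients are known.

```python
import numpy as np

# Made-up data generated from y = 3x + 2, so the fit is exact.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x + 2.0

# Design matrix with a column of ones for the intercept.
X = np.column_stack([x, np.ones_like(x)])

# Solve min ||X w - y||^2 for w = (slope, intercept).
(slope, intercept), *_ = np.linalg.lstsq(X, y, rcond=None)

# R-squared: 1 - residual sum of squares / total sum of squares.
predictions = X @ np.array([slope, intercept])
ss_res = np.sum((y - predictions) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(round(slope, 6), round(intercept, 6), round(r_squared, 6))
```

On real data the R-squared should be computed on a held-out test set rather than on the rows used for fitting.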

Real-World Example

In a production scenario, a company might use linear regression to predict sales based on advertising spend across different channels. They would collect historical data on advertising budgets and corresponding sales figures. By fitting a linear regression model with scikit-learn, the data scientists would analyze how changes in advertising efforts affect sales outcomes, enabling the marketing team to optimize their strategies for better returns.

⚠ Common Mistakes

One common mistake is not normalizing or standardizing the input features when it matters: plain least squares predictions do not depend on feature scale, but coefficients on different scales are hard to compare, and regularized variants such as Ridge or Lasso penalize unscaled features unevenly. Another mistake is ignoring the assumptions of linear regression, such as linearity and homoscedasticity, which can result in misleading interpretations of the model. Additionally, many developers forget to evaluate model performance on a test set, leading to overestimation of how well the model will perform with unseen data.

🏭 Production Scenario

In a recent project at a mid-sized e-commerce firm, we needed to forecast future sales based on past sales data and multiple advertising channels. Implementing linear regression allowed us to determine which channels were most effective. However, we faced challenges when some channels showed multicollinearity, impacting the reliability of our predictions. Understanding and correcting for this helped deliver more accurate forecasts to the marketing team.

Follow-up Questions

What are some assumptions made by linear regression? How would you handle multicollinearity in your model? Can you explain how you would evaluate the performance of your linear regression model? What would you do if your model showed signs of overfitting?

ID: PY-JR-004 · Difficulty: 4/10 · Level: Junior

PY-INT-005 How do you handle large files in Python without loading them entirely into memory?

Python · Core Python · Intermediate

4/10

Answer

Use generators, file iteration (files are iterators in Python), or chunk-based reading. Avoid calling read() with no size argument or readlines() on large files, because both load the entire file into memory.

Deep Explanation

Python file objects are iterators: you can iterate over them line by line without loading the entire file. For binary files, or files where line iteration is not appropriate, use file.read(chunk_size) to read fixed-size chunks in a loop. For CSV files, use csv.DictReader (which iterates lazily) or pandas with the chunksize parameter (pd.read_csv('file.csv', chunksize=10000) returns an iterator of DataFrames). For JSON, use ijson for streaming JSON parsing. The with statement ensures the file is properly closed. For very large files (100GB+), memory-mapped files (mmap module) allow treating file content as if it were in memory while the OS handles paging.
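
A sketch of the two standard-library approaches, line iteration and fixed-size chunks. The helper names and the tiny demo file are invented for illustration; a real log would simply be larger.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=64 * 1024):
    """Hash a file of any size while holding one chunk in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_matching_lines(path, needle, encoding="utf-8"):
    """Count lines containing needle by iterating the file lazily."""
    with open(path, encoding=encoding) as f:
        return sum(1 for line in f if needle in line)


# Small demo file standing in for a large log.
demo_path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")

print(count_matching_lines(demo_path, "ERROR"))  # 2
```

Passing the encoding explicitly, as count_matching_lines does, avoids depending on the system default.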

Real-World Example

A log analysis system needed to process 50GB daily log files to extract error counts. Using open(file).read() caused OOM crashes. Refactoring to iterate line by line (for line in file) reduced memory usage from 50GB to under 10MB while processing the same file.

⚠ Common Mistakes

Using file.readlines() which builds a complete list of all lines in memory. Using pd.read_csv() without chunksize on multi-GB files. Not closing files (always use with statement). Forgetting to handle encoding explicitly — defaulting to system encoding causes silent corruption on non-ASCII data.

🏭 Production Scenario

A production data pipeline at a logistics company was crashing nightly when processing a 30GB shipment data CSV. The fix used pandas chunked reading: processing 50,000 rows at a time, aggregating results, and writing summaries, which reduced peak memory from 45GB (crashing the server) to 2GB.

Follow-up Questions

What is the mmap module and when would you use it? How does ijson enable streaming JSON parsing? How do you process a large file in parallel in Python?

ID: PY-INT-005 · Difficulty: 4/10 · Level: Intermediate

PY-JR-005 Can you describe a time when you had to solve a problem in Python, and how you approached it?

Python · Behavioral & Soft Skills · Junior

4/10

Answer

I once had an issue with a script that was processing data too slowly. To tackle it, I first identified the bottleneck using profiling tools, and then I optimized the algorithms and data structures to improve performance. This methodical approach helped me significantly reduce the processing time.

Deep Explanation

When faced with a performance issue in Python, it's essential to first diagnose the problem accurately. This can involve using profiling tools like cProfile to identify which parts of the code consume the most time or resources. Once the bottleneck is identified, optimizations can be made, such as choosing more efficient algorithms or data structures. Additionally, understanding the time complexity of these algorithms is crucial, as even small improvements in big O notation can lead to substantial performance gains in larger datasets. It's also important to test changes thoroughly to ensure that the optimizations do not introduce new bugs or regressions.

Real-World Example

In my previous role, we had a Python script that aggregated logs from multiple services for analysis. It was taking too long to run on a daily basis, impacting our reporting timeline. By profiling the script, we discovered that a specific loop was inefficiently processing data. I rewrote that part to use dictionary lookups instead of nested loops, which reduced the execution time from several minutes to under 30 seconds, allowing reports to be generated on time.

⚠ Common Mistakes

A common mistake is jumping to conclusions about what part of the code is slow without proper profiling. This can lead to wasted effort optimizing the wrong sections. Another mistake is neglecting to consider readability and maintainability when optimizing; more complex code can often become a maintenance burden. Additionally, developers may forget to test the performance of their solutions against a representative dataset, which can result in performance regressions when deployed in production.

🏭 Production Scenario

In a production environment, I once encountered a situation where an ETL process written in Python was taking too long every night, causing delays in data availability for our analytics team. The insights from our users relied heavily on timely data, which prompted an immediate need for optimization. Addressing this issue not only improved our workflow but also increased user satisfaction with our reporting capabilities.

Follow-up Questions

What specific profiling tools have you used in Python? Can you give an example of an algorithm you optimized? How do you ensure your optimizations maintain code readability? What steps do you take to test your optimizations?

ID: PY-JR-005 · Difficulty: 4/10 · Level: Junior

PY-INT-001 What is a list comprehension and when should you NOT use one?

Python · Core Python · Intermediate

4/10

Answer

A list comprehension is a concise way to create lists using a single line expression. Avoid them when the logic is complex enough that a regular loop is more readable.

Deep Explanation

List comprehensions follow the syntax [expression for item in iterable if condition]. In CPython they are usually somewhat faster than the equivalent for loop with append(), because the loop body compiles to a dedicated bytecode that skips the repeated method lookup and call. However, they are not always the right choice. Avoid them when: the logic requires multiple nested conditions, you need to handle exceptions inside the loop, the comprehension spans more than two lines when formatted, or you are consuming a large dataset where a generator expression would be more memory-efficient. Nested list comprehensions (list comprehensions inside list comprehensions) are almost always a readability mistake.
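
The memory point can be seen by comparing a list comprehension with the equivalent generator expression. The numbers here are arbitrary, and sys.getsizeof reports only the container itself, not the integers it refers to.

```python
import sys

numbers = range(100_000)

# List comprehension: builds every element before anything consumes it.
squares_list = [n * n for n in numbers if n % 2 == 0]

# Generator expression: same logic, produces one element at a time.
squares_gen = (n * n for n in numbers if n % 2 == 0)

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)   # consumes the generator

print(list_size > gen_size)                # True
print(total_from_list == total_from_gen)   # True
```

The generator can be iterated only once, which is fine for a single pass such as sum() but not when the results are needed again.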

Real-World Example

In a data processing pipeline: [user.email for user in users if user.is_active and user.verified] is clean and appropriate. But building a matrix transformation with three nested comprehensions is a maintainability trap — a regular loop with clear variable names is better for the next developer.

⚠ Common Mistakes

Nesting comprehensions three levels deep, making code unreadable. Using list comprehensions when you actually need a generator (you are iterating once over a large dataset). Adding side effects inside comprehensions (modifying external state), which is a major anti-pattern.

🏭 Production Scenario

A memory crash in a production data export service was traced to a list comprehension processing 2 million records at once, loading everything into memory. Replacing it with a generator expression fixed the memory issue without changing any other code.

Follow-up Questions

What is the difference between a list comprehension and a generator expression? How do dict comprehensions and set comprehensions work? What is the performance difference between a comprehension and a map() call?

ID: PY-INT-001 · Difficulty: 4/10 · Level: Intermediate

PY-MID-003 Can you explain how to use Python’s subprocess module for executing shell commands and how you would handle potential errors?

Python · DevOps & Tooling · Mid-Level

5/10

Answer

Python's subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. To handle errors, you can use try-except blocks and check the return code to ensure the command executed successfully.

Deep Explanation

The subprocess module is a powerful tool for managing system processes. You can use subprocess.run() (the recommended high-level API), subprocess.Popen() (for finer control over a running process), or the older subprocess.call() to execute commands. run() and Popen() let you capture output, and all three expose the return code. It's essential to check the return code; a return code of zero generally indicates success, while any non-zero value indicates an error. You should also be cautious about shell injection when commands or arguments include user input. In such cases, prefer passing a list of arguments instead of a single string with shell=True to mitigate risks.
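
A minimal sketch of the list-argument form with error handling. The wrapper name run_command, the 30-second timeout, and the returned tuple shape are choices made for this example; the demo runs the current Python interpreter so it does not depend on any external tool.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list; return (ok, output_or_error)."""
    try:
        completed = subprocess.run(
            args,                 # list form: no shell is involved
            check=True,           # raise CalledProcessError on non-zero exit
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode bytes to str
            timeout=30,
        )
    except subprocess.CalledProcessError as exc:
        return False, f"exit code {exc.returncode}: {exc.stderr.strip()}"
    except FileNotFoundError:
        return False, f"command not found: {args[0]}"
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return True, completed.stdout.strip()


success = run_command([sys.executable, "-c", "print('deployed')"])
failure = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
print(success)  # (True, 'deployed')
print(failure)  # (False, 'exit code 3: ')
```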

Real-World Example

In a deployment script for a web application, I utilized the subprocess module to run deployment commands. I needed to execute a shell command that fetched the latest code from a repository. I used subprocess.run() and set the 'check' parameter to True, which raised a CalledProcessError if the command failed. This allowed me to log the error and gracefully handle the failure by reverting to the last stable state instead of crashing the entire deployment.

⚠ Common Mistakes

One common mistake is to neglect error handling, which can lead to unhandled exceptions if a command fails. Developers may also confuse the usage of subprocess.run() with subprocess.call() and not recognize that run() returns a CompletedProcess instance, not just the return code. Additionally, using shell=True can expose the application to shell injection vulnerabilities, especially if user input is included in the command string; it’s generally safer to use list arguments instead.

🏭 Production Scenario

In a recent production update, we faced issues when executing a subprocess command to deploy a new feature. The command failed due to insufficient permissions, but without proper error handling in our script, it crashed the entire deployment pipeline. This highlighted the need for robust subprocess management with error checks to ensure smooth deployments and avoid downtime.

Follow-up Questions

What are the differences between subprocess.run() and subprocess.Popen()? How would you manage standard output and error when using subprocess? Can you explain how to avoid shell injection vulnerabilities when using subprocess? What considerations should you have when running subprocess commands in a multi-threaded environment?

ID: PY-MID-003 · Difficulty: 5/10 · Level: Mid-Level

PY-MID-002 Can you explain how to manage package dependencies in Python projects and what tools you would use?

Python · Frameworks & Libraries · Mid-Level

5/10

Answer

To manage package dependencies in Python projects, I recommend using virtual environments combined with pip and a requirements.txt file. This keeps dependencies isolated and manageable across different projects.

Deep Explanation

Managing package dependencies is crucial in Python development to avoid conflicts between libraries and ensure that your application runs smoothly in different environments. A virtual environment, created using tools like venv or virtualenv, allows you to create an isolated space for your project dependencies, preventing version clashes with globally installed packages. Additionally, using pip along with a requirements.txt file helps to specify exact versions of dependencies, enabling consistent installs across development, testing, and production environments. It's good practice to regularly update your dependencies and review them for security vulnerabilities, as outdated packages can introduce risks to your application.

Another important aspect of dependency management is understanding the differences between 'requirements.txt' and 'Pipfile'. While requirements.txt is straightforward, Pipenv, which utilizes Pipfile, offers a higher-level dependency management tool that automatically manages virtual environments and simplifies the installation and locking of packages with Pipfile.lock. This can enhance project reproducibility and ease collaboration among team members.

Real-World Example

In a recent project, we were developing a web application using Flask. We set up a virtual environment to manage our dependencies, allowing us to use specific versions of Flask and its extensions without affecting other projects. We maintained a requirements.txt file that listed the core packages and their respective versions, which was essential when deploying the app to different environments such as staging and production. This approach helped avoid compatibility issues and ensured that all team members had the same setup during development.

⚠ Common Mistakes

One common mistake is neglecting to use virtual environments, which can lead to conflicts with globally installed packages and make dependency management cumbersome. Developers often find themselves troubleshooting version issues that could have been avoided. Another mistake is failing to specify exact package versions in requirements.txt. This can lead to unexpected behavior in production if a newer version of a dependency contains breaking changes. Maintaining consistency in dependency versions is key to ensuring reliable application performance.

🏭 Production Scenario

Imagine a situation where you're deploying a Python web application to production, and it starts throwing errors due to a library version mismatch that wasn't present in development. This can happen if you skip using a virtual environment or if you don’t lock your package versions. Understanding how to manage dependencies effectively would be crucial in avoiding such headaches and ensuring a smooth deployment process.

Follow-up Questions

How would you handle dependency conflicts in a project? Can you explain the difference between requirements.txt and Pipfile? What tools do you use to ensure your dependencies are secure? Have you ever faced any issues with dependencies in production?

ID: PY-MID-002 · Difficulty: 5/10 · Level: Mid-Level

PY-MID-001 Can you explain what Flask is and how it differs from Django in terms of building web applications?

Python · Frameworks & Libraries · Mid-Level

5/10

Answer

Flask is a lightweight WSGI web application framework for Python that is designed to make it easy to get a project up and running with minimal setup. Unlike Django, which is a full-featured framework that includes an ORM and admin interface out of the box, Flask provides more flexibility and simplicity by allowing developers to choose their tools and libraries.

Deep Explanation

Flask operates on the principle of being minimalistic and modular. It allows developers to start with a single file and incrementally add functionality as needed, which makes it great for small to medium-sized applications or microservices. Its simplicity provides a lower learning curve for beginners and gives greater control for experienced developers to tailor their setup. However, this also means that developers need to make more decisions about things like database integration and user authentication that would come out of the box in Django, which can introduce complexity in larger projects. Ultimately, the choice between Flask and Django should depend on project requirements, team familiarity, and the desired level of abstraction in application architecture. Developers need to weigh the benefits of Flask's flexibility against Django's rapid development capabilities and built-in features.

Real-World Example

In a recent project at my company, we built a lightweight API service using Flask due to its simplicity. We had specific requirements for integrating custom authentication and RESTful routes. By using Flask, we could easily incorporate extensions like Flask-RESTful and Flask-JWT without the overhead of a large framework. The team appreciated how quickly we could iterate during development while maintaining control over the components we integrated, which would have been more rigid in Django.

⚠ Common Mistakes

A common mistake developers make when choosing between Flask and Django is underestimating the scope of the project. Flask seems appealing for its ease of use, but for larger applications that require built-in features like ORM and admin panels, developers might end up writing excessive boilerplate code. On the other hand, some may choose Django for small applications and end up dealing with unnecessary overhead, which complicates deployment and maintenance. It’s important to align the framework choice with project needs, rather than personal preference alone.

🏭 Production Scenario

In a production environment, I have seen teams struggle with managing dependencies and configurations when using Flask for larger applications. As teams expand and the application grows, the initial flexibility of Flask can turn into a challenge, as decisions made early on about the libraries and architecture may not scale well. Proper planning and regular code reviews are crucial to avoid pitfalls as the project matures.

Follow-up Questions

What are some common Flask extensions you have used? How do you handle database migrations in Flask? Can you discuss a time when Flask's flexibility caused challenges in a project? How would you compare the performance of Flask vs. Django?

ID: PY-MID-001 · Difficulty: 5/10 · Level: Mid-Level

PY-INT-004 How do context managers work and how do you create a custom one?

Python · Core Python · Intermediate

5/10

Answer

Context managers use __enter__ and __exit__ methods to manage setup and teardown of resources. The 'with' statement calls these automatically, ensuring cleanup even if an exception occurs.

Deep Explanation

When you use 'with open(file) as f', Python calls __enter__() on the file object to set up and binds the result to f, then calls __exit__() to clean up when the block ends. You can create custom context managers two ways: implement __enter__ and __exit__ in a class, or use the @contextmanager decorator from contextlib with a generator function that yields once. The __exit__ method receives exception information and can suppress exceptions by returning True. Context managers are the Pythonic way to handle any resource that needs guaranteed cleanup: database connections, locks, temporary directories, timers, and transaction management.
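
Both ways of writing a context manager can be sketched side by side. The names Transaction and transaction are hypothetical, and a plain list stands in for a real connection so the order of events is visible.

```python
from contextlib import contextmanager


class Transaction:
    """Class-based form; `log` stands in for a real connection."""

    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("begin")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append("rollback" if exc_type else "commit")
        return False  # do not suppress the exception


@contextmanager
def transaction(log):
    """Generator-based form of the same behavior."""
    log.append("begin")
    try:
        yield
    except Exception:
        log.append("rollback")
        raise
    else:
        log.append("commit")


events = []
with Transaction(events):
    pass                       # no error: commit
try:
    with transaction(events):
        raise ValueError("boom")
except ValueError:
    pass                       # error: rollback, then re-raised
print(events)  # ['begin', 'commit', 'begin', 'rollback']
```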

Real-World Example

A database transaction context manager in a Django-like ORM: __enter__ begins the transaction, and __exit__ commits if no exception occurred or rolls back if one did. This pattern ensures no transaction is ever left open, regardless of what happens inside the with block.

⚠ Common Mistakes

Not handling exceptions in __exit__, letting them propagate when they should be caught. Creating context managers with @contextmanager and forgetting to wrap the yield in try-finally, which skips cleanup on exceptions. Using try-finally everywhere instead of the cleaner with statement.

🏭 Production Scenario

A production PostgreSQL service had intermittent connection failures traced to database transactions being left open. The root cause was exception handling that bypassed the connection cleanup code. Refactoring to use a context manager with proper __exit__ eliminated the issue permanently.

Follow-up Questions

What is the contextlib module? How do nested context managers work? What is contextlib.ExitStack used for?

ID: PY-INT-004 · Difficulty: 5/10 · Level: Intermediate

PY-DS-001 What is the difference between pandas DataFrame.apply() and vectorized operations?

Python · Data Science · Intermediate

5/10

Answer

Vectorized operations (using NumPy/pandas built-ins) operate on entire arrays at once in optimized C code. apply() calls a Python function row by row or column by column in pure Python. Vectorized operations are 10-1000x faster; use apply() only when no vectorized alternative exists.

Deep Explanation

pandas is built on NumPy, which stores data in contiguous memory arrays and performs operations in optimized C/Fortran code without Python overhead. When you write df['price'] * 1.1, NumPy multiplies the entire array in C. When you write df.apply(lambda x: x['price'] * 1.1, axis=1), Python calls a function for every single row, potentially millions of function calls with Python overhead each time. The performance gap is enormous: for a 1M-row DataFrame, vectorized operations might take 10ms while apply() takes 10-30 seconds. Use apply() only for: operations that cannot be expressed vectorially, complex multi-column operations with conditional logic, or when applying a function that expects a Series object.
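
The two styles produce the same values, which a tiny made-up DataFrame can show. The column names and the in_stock rule are invented for illustration, and no timing is measured here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 0, 5]})

# Row-wise apply: one Python function call per row.
slow = df.apply(lambda row: row["price"] * 1.1, axis=1)

# Vectorized: one operation over the whole column.
fast = df["price"] * 1.1

# Vectorized conditional instead of apply with an if/else per row.
df["in_stock"] = np.where(df["qty"] > 0, "yes", "no")

print(np.allclose(slow, fast))   # True
print(list(df["in_stock"]))      # ['yes', 'no', 'yes']
```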

Real-World Example

A daily sales report generation for a retail chain was taking 45 minutes to run on a 5M-row transaction DataFrame. Profiling revealed three apply() calls doing price calculations that could be rewritten as vectorized operations. Replacing them reduced runtime to 90 seconds — a 30x speedup with no algorithmic change.

⚠ Common Mistakes

Using apply() for simple arithmetic that pandas/NumPy can do natively. Using apply(axis=1) to iterate rows for anything that can be done with vectorized conditionals (use np.where instead). Not knowing about str accessor methods (df['col'].str.contains()) which provide vectorized string operations avoiding apply() entirely.

🏭 Production Scenario

A pandas ETL pipeline at a financial data company was processing end-of-day data and regularly missing the 6 AM business deadline. Profiling showed apply() calls for currency conversion and date parsing were the bottleneck. Replacing with vectorized arithmetic and pd.to_datetime() reduced the pipeline from 4 hours to 18 minutes.

Follow-up Questions

What is the difference between apply() and applymap()? How does numpy.vectorize() differ from true vectorization? When should you use Polars instead of pandas?

ID: PY-DS-001 · Difficulty: 5/10 · Level: Intermediate

PY-INT-008 How do Python dictionaries work internally and what is their time complexity?

Python · Core Python · Intermediate

5/10

Answer

Python dictionaries are hash tables. Lookup, insertion, and deletion are O(1) in the average case. Hash collisions can degrade this to O(n) in the worst case, but Python's implementation makes this extremely rare. Python 3.7+ guarantees insertion-order preservation.

Deep Explanation

Dictionaries store key-value pairs in a hash table. When you set d[key] = value, Python computes hash(key), maps it to a bucket, and stores the value. When you access d[key], Python recomputes the hash and looks up the bucket directly, which is O(1) on average. Hash collisions (two different keys mapping to the same bucket) are resolved via open addressing in CPython. CPython 3.6 introduced a compact dictionary representation that preserves insertion order as a side effect. Python 3.7 made insertion-order preservation part of the language. Only hashable objects can be dictionary keys: strings, integers, and tuples of hashable items work, but lists and other dicts do not. dict.get(key, default) avoids KeyError for missing keys. collections.defaultdict automatically creates default values. collections.Counter counts hashable objects.
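
The key rules and helper types mentioned above can be seen in a short sketch. The route table, log levels, and sentence are made-up sample data.

```python
from collections import Counter, defaultdict

# Hashable keys work; a tuple of hashables is fine, a list is not.
routes = {("GET", "/users"): "list_users"}
try:
    routes[["GET", "/users"]] = "oops"
except TypeError as exc:
    unhashable_error = str(exc)   # message mentions "unhashable type"

# get() returns a default instead of raising KeyError.
handler = routes.get(("POST", "/users"), "not_found")

# defaultdict creates the missing value on first access.
by_level = defaultdict(list)
for level, msg in [("ERROR", "disk"), ("INFO", "start"), ("ERROR", "net")]:
    by_level[level].append(msg)

# Counter counts hashable items and keeps first-seen order (3.7+).
counts = Counter("the cat and the hat and the bat".split())

print(handler)               # not_found
print(by_level["ERROR"])     # ['disk', 'net']
print(counts.most_common(1)) # [('the', 3)]
```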

Real-World Example

In a word frequency counter processing millions of log lines, dict-based counting with Counter scales better than sort-then-count approaches: O(n) with a hash table versus O(n log n) for sorting. In a URL routing system, a dict of {path: handler} gives average O(1) lookup for exact-match routes regardless of how many routes exist.
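A small sketch of both uses, assuming hypothetical log lines and handler functions (home, health, and dispatch are invented names, not part of any framework):

```python
from collections import Counter

log_lines = ["error timeout", "info ok", "error disk full"]  # hypothetical input

# One pass over the data; each update is an average O(1) hash-table operation.
word_counts = Counter()
for line in log_lines:
    word_counts.update(line.split())

# Hypothetical handlers; a real framework layers pattern matching on top.
def home():
    return "home page"

def health():
    return "ok"

routes = {"/": home, "/health": health}

def dispatch(path):
    """Look up the handler for an exact path and call it."""
    handler = routes.get(path)
    return handler() if handler else "404 not found"
```

The lookup cost in dispatch does not grow with the number of entries in routes, which is the property the routing example relies on.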

⚠ Common Mistakes

Using a list to check membership ('if item in list' is O(n); use a set or dict instead). Adding or removing keys while iterating over a dictionary (raises 'RuntimeError: dictionary changed size during iteration'; iterate over list(d.items()) instead). Using mutable objects such as lists as dictionary keys (raises TypeError for an unhashable type). Writing verbose if-key-in-dict patterns instead of using setdefault() or defaultdict.
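The iteration pitfall and its fix can be sketched as follows; the dictionaries and their contents are illustrative only.

```python
cache = {"a": 1, "b": 0, "c": 3}

# Deleting from the live dict changes its size in the middle of the loop.
try:
    for key, value in cache.items():
        if value == 0:
            del cache[key]
except RuntimeError as exc:
    error_message = str(exc)

# Safe version: iterate over a snapshot, mutate the original.
stale = {"a": 1, "b": 0, "c": 3}
for key, value in list(stale.items()):
    if value == 0:
        del stale[key]

# setdefault replaces the verbose "if key not in d: d[key] = []" pattern.
index = {}
for word in ["apple", "avocado", "banana"]:
    index.setdefault(word[0], []).append(word)
```

The error is raised on the iteration step after the deletion, so a deletion that happens to occur on the last item can go unnoticed. That makes the bug intermittent rather than absent.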

🏭 Production Scenario

A production request deduplication service was checking whether a request ID had been seen by using a list (if request_id in seen_list). At 10,000 requests per second, the O(n) membership check was consuming 60% of CPU time. Replacing the list with a set (average O(1) lookup) reduced CPU usage to 2% with identical functionality.
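A minimal sketch of the set-based check, assuming string request IDs. The names seen_ids and is_duplicate are hypothetical, and a real service would also need some way to expire old IDs so the set does not grow without bound.

```python
seen_ids = set()

def is_duplicate(request_id):
    """Return True if request_id was seen before; record it otherwise."""
    if request_id in seen_ids:      # average O(1) hash lookup
        return True
    seen_ids.add(request_id)
    return False

results = [is_duplicate(rid) for rid in ["r1", "r2", "r1", "r3", "r2"]]
```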

Follow-up Questions

How does a Python set differ from a dict internally? What is the difference between dict and OrderedDict after Python 3.7? What is a dict comprehension, and when should you use defaultdict instead?

ID: PY-INT-008 · Difficulty: 5/10 · Level: Intermediate

PY-INT-002 How do decorators work in Python and what is the functools.wraps issue?

Python Core Python Intermediate

5/10

Answer

A decorator is a function that wraps another function to add behavior. Without functools.wraps, the wrapper loses the original function's metadata, such as __name__ and __doc__.

Deep Explanation

Decorators work by taking a function as input and returning a new function that adds behavior before or after the original call. The syntax @decorator is syntactic sugar for function = decorator(function). The core problem is that the returned wrapper function has its own identity: its __name__ is 'wrapper', not the original function's name. This breaks logging, debugging, and documentation tools. Applying @functools.wraps(original_func) to the wrapper copies metadata such as __name__, __qualname__, and __doc__ from the original function onto the wrapper and sets __wrapped__ to the original. This is especially critical in Flask, where the endpoint name defaults to the view function's __name__: without wraps, every decorated view is named 'wrapper', and registering a second such route fails with an AssertionError about overwriting an existing endpoint function. FastAPI reads the endpoint's signature to derive request parameters, so a wrapper declared as (*args, **kwargs) without wraps hides the real parameters there as well.
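The metadata loss is easy to see side by side. In this sketch the decorators traced_naive and traced, and the functions they decorate, are hypothetical names chosen for illustration.

```python
import functools

calls = []

def traced_naive(func):
    # No functools.wraps: the returned function keeps its own identity.
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper

def traced(func):
    @functools.wraps(func)  # copy __name__, __doc__, etc. and set __wrapped__
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper

@traced_naive
def fetch_user(user_id):
    """Return a user record."""
    return {"id": user_id}

@traced
def fetch_order(order_id):
    """Return an order record."""
    return {"id": order_id}

naive_name = fetch_user.__name__    # metadata of the wrapper
kept_name = fetch_order.__name__    # metadata of the original
```

Both decorated functions behave identically when called; only introspection differs, which is why the problem tends to surface in frameworks and tooling rather than in unit tests of the function itself.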

Real-World Example

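In a Flask application, a custom authentication decorator without functools.wraps gave every protected route the same endpoint name, 'wrapper'. In current Flask versions this surfaces when the second protected route is registered, as an AssertionError reporting that the view function mapping is overwriting an existing endpoint function, and url_for() cannot address the views by their intended names. Adding @functools.wraps(f) to the inner wrapper function fixed it immediately.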
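The collision can be reproduced without Flask. The sketch below uses TinyRouter, a hypothetical toy registry that keys view functions by __name__; it is a simplified stand-in for the idea, not Flask's actual API or implementation.

```python
import functools

class TinyRouter:
    """Toy stand-in for a framework that keys view functions by __name__."""

    def __init__(self):
        self.endpoints = {}

    def route(self, path):
        def register(func):
            name = func.__name__
            if name in self.endpoints:
                raise AssertionError(f"endpoint name already registered: {name}")
            self.endpoints[name] = (path, func)
            return func
        return register

def login_required_naive(f):
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper

def login_required(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return f(*args, **kwargs)
    return wrapper

broken = TinyRouter()

@broken.route("/profile")
@login_required_naive
def profile():
    return "profile"

try:
    @broken.route("/settings")
    @login_required_naive
    def settings():
        return "settings"
except AssertionError as exc:
    collision = str(exc)  # the second view is also named 'wrapper'

fixed = TinyRouter()

@fixed.route("/profile")
@login_required
def profile_view():
    return "profile"

@fixed.route("/settings")
@login_required
def settings_view():
    return "settings"
```

With functools.wraps in place, each view keeps its own name and both registrations succeed.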

⚠ Common Mistakes

Forgetting @functools.wraps on the inner wrapper function. Writing decorators that do not preserve the function signature, which breaks tools that inspect function parameters. Applying decorators in the wrong order when stacking them (the decorator closest to the def is applied first).
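Stacking order and signature preservation can both be checked directly. In this sketch, tag is a hypothetical decorator factory (a decorator that takes its own arguments) and add is an illustrative function.

```python
import functools
import inspect

def tag(label, log):
    """Decorator factory: append label to log each time the function is called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log.append(label)
            return func(*args, **kwargs)
        return wrapper
    return decorator

events = []

@tag("outer", events)   # applied last, so its wrapper runs first
@tag("inner", events)   # applied first, so it sits closest to add()
def add(a, b=2):
    return a + b

result = add(1)

# inspect.signature follows __wrapped__, so the original parameters are visible.
signature_text = str(inspect.signature(add))
```

The stacked form is equivalent to add = tag("outer", events)(tag("inner", events)(add)), which is why the outer wrapper executes before the inner one at call time.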

🏭 Production Scenario

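A production Flask API broke its authentication after a refactor added a logging decorator without functools.wraps. The route registration system saw multiple view functions all named 'wrapper' and could keep only one under that endpoint name, so several API endpoints became unreachable even though the view code itself was correct. Current Flask versions reject the duplicate endpoint with an AssertionError at registration rather than dropping it silently.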

Follow-up Questions

How do class-based decorators work? How do you write a decorator that accepts its own arguments? How does decorator stacking (applying multiple decorators) work in Python?

ID: PY-INT-002 · Difficulty: 5/10 · Level: Intermediate

PY-INT-003 What is the GIL in Python and how does it affect multithreading?

Python Performance Intermediate

6/10

Answer

The Global Interpreter Lock (GIL) is a mutex that prevents multiple native threads from executing Python bytecode simultaneously. It makes Python threads unsuitable for CPU-bound parallelism.

Deep Explanation

CPython (the standard Python implementation) uses reference counting for memory management. The GIL protects these reference counts from race conditions by ensuring only one thread executes Python bytecode at a time. This means Python threads do NOT run in true parallel for CPU-bound tasks; they take turns. However, the GIL is released during blocking I/O operations (file reads, network calls, database queries), so threading IS effective for I/O-bound tasks. For true CPU parallelism, use the multiprocessing module, which spawns separate processes, each with its own GIL, or use libraries like NumPy that release the GIL inside their C extensions.
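A rough sketch of the I/O-bound case, using time.sleep as a stand-in for a blocking network call (sleeping releases the GIL in the same way). The function name fake_io and the timings are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    time.sleep(0.2)   # releases the GIL while waiting, like a blocking read
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(fake_io, range(5)))
elapsed = time.perf_counter() - start
```

Run sequentially, the five calls would take about one second. With five threads the waits overlap, so elapsed should be close to a single 0.2 second wait, although exact timings depend on the machine.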

Real-World Example

A web scraper fetching 100 URLs runs significantly faster with threads because most of the time is spent waiting on network I/O, during which the GIL is released. Applying the same approach to parsing and processing 100 large JSON files (CPU-bound) would see little or no speedup from threading; multiprocessing or concurrent.futures.ProcessPoolExecutor should be used instead.
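For the CPU-bound case, a process pool might look like the sketch below. Here count_primes is a hypothetical stand-in for real work such as JSON parsing, and the limits are arbitrary.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

def count_in_parallel(limits):
    # Each worker is a separate process with its own interpreter and GIL.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))

if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    print(count_in_parallel([20_000, 20_000, 20_000, 20_000]))
```

The worker function must be defined at module top level so it can be pickled and sent to the worker processes. For small inputs, the cost of starting processes and moving data between them can outweigh the gain.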

⚠ Common Mistakes

Using threading for CPU-intensive tasks and being confused when there is no performance improvement. Assuming multiprocessing will always be better: it has significant overhead for process startup and inter-process communication (IPC). Not considering asyncio for I/O-bound tasks, which typically scales better than threading when there are many concurrent connections.
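A minimal asyncio sketch for the high-concurrency I/O case, with asyncio.sleep standing in for a non-blocking network call. The names fake_fetch and fetch_all are hypothetical.

```python
import asyncio
import time

async def fake_fetch(task_id):
    await asyncio.sleep(0.1)   # yields to the event loop while "waiting"
    return task_id

async def fetch_all(n):
    # gather runs the coroutines concurrently on one thread, results in input order
    return await asyncio.gather(*(fake_fetch(i) for i in range(n)))

start = time.perf_counter()
results = asyncio.run(fetch_all(100))
elapsed = time.perf_counter() - start
```

All 100 waits overlap on a single thread, so no thread-per-task memory or context-switch cost is paid. This only helps when the libraries in use are themselves async; a blocking call inside a coroutine stalls the whole event loop.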

🏭 Production Scenario

A production image processing service used Python threading, expecting parallel image resizing. Performance was identical to single-threaded execution. The fix was switching to multiprocessing.Pool, which reduced processing time by 75% on an 8-core server by actually utilizing all cores.

Follow-up Questions

What is the difference between threading, multiprocessing, and asyncio? When does Python release the GIL? Does Jython or PyPy have a GIL?

ID: PY-INT-003 · Difficulty: 6/10 · Level: Intermediate
