Databricks Notebook Import Error Fix
This guide shows how to diagnose and fix the missing-library and import-path errors that appear after importing a notebook into Databricks.
You import a notebook into Databricks and it fails with Library not installed: pandas or ModuleNotFoundError: No module named 'databricks.sdk' (the PyPI package is databricks-sdk). Either the cluster does not have the required Python libraries, or the installed version is incompatible with the Databricks Runtime.
Step-by-Step Fix
1. Check installed libraries on the cluster
# Run in a notebook cell
# importlib.metadata is the standard-library replacement for the deprecated pkg_resources
from importlib.metadata import distributions
sorted_packages = sorted(f"{d.metadata['Name']}=={d.version}" for d in distributions())
sorted_packages[:20]
The output lists the first 20 installed packages, in alphabetical order, with their versions.
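When only a handful of packages matter, checking for them by name is quicker than scanning the full list. The following is a minimal sketch, assuming Python 3.8 or later on the cluster; `find_missing` and the package names passed to it are illustrative and not part of any Databricks API.

```python
from importlib.metadata import PackageNotFoundError, version


def find_missing(required):
    """Return the names in `required` that are not installed in this environment."""
    missing = []
    for name in required:
        try:
            version(name)  # raises PackageNotFoundError if the distribution is absent
        except PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical requirement list for the imported notebook
print(find_missing(["pandas", "databricks-sdk"]))
```

An empty list means every named package is present; anything else is a candidate for the install steps that follow.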
2. Install missing libraries with %pip
# Wrong — using conda install in a notebook
# %conda install pandas
# Right — use %pip magic for installation
%pip install pandas==2.1.0 numpy==1.24.3
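After a pinned install it can help to confirm that the environment really holds the versions you asked for, since a runtime may ship its own copies. This is a rough sketch to run in a later cell; `check_pins` is a hypothetical helper name, and the comparison is a plain string match rather than full version-specifier parsing.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins):
    """Map each package to (expected, installed) where the two differ.

    `installed` is None when the package is missing altogether.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


# Pins taken from the %pip line above
print(check_pins({"pandas": "2.1.0", "numpy": "1.24.3"}))
```

An empty dictionary means every pin matches the installed version.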
3. Install libraries at the cluster level
In Databricks UI: Clusters > Libraries > Install New > PyPI > enter package name.
Or via API:
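# Wrong: notebook-scoped install, gone when the session ends
# %pip install great-expectations
# Right: install as a cluster library (persistent)
import requests

TOKEN = "<personal-access-token>"  # placeholder; load it from a secret scope instead of hardcoding
response = requests.post(
    "https://<workspace>.cloud.databricks.com/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},  # the API rejects unauthenticated requests
    json={
        "cluster_id": "1234-567890-cluster123",
        "libraries": [{"pypi": {"package": "great-expectations"}}],
    },
)
response.raise_for_status()  # surface 4xx/5xx errors instead of continuing silently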
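The install endpoint returns before the library is actually installed, so a follow-up read of GET /api/2.0/libraries/cluster-status is usually needed. The sketch below keeps the network calls out and shows only the two pure pieces: building the request body and picking out libraries that are not yet installed. The helper names are hypothetical, the sample response is hand-written, and the `library_statuses` and `status` field names reflect my understanding of the Libraries API, so check them against the API reference before relying on them.

```python
def build_install_payload(cluster_id, packages):
    """Build the JSON body for POST /api/2.0/libraries/install from PyPI specs."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": spec}} for spec in packages],
    }


def not_yet_installed(status_response):
    """List (package, status) pairs whose status is anything other than INSTALLED."""
    pending = []
    for entry in status_response.get("library_statuses", []):
        if entry.get("status") != "INSTALLED":
            package = entry.get("library", {}).get("pypi", {}).get("package")
            pending.append((package, entry.get("status")))
    return pending


payload = build_install_payload("1234-567890-cluster123", ["great-expectations"])

# Hand-written sample shaped like a cluster-status response, not real output
sample_status = {
    "cluster_id": "1234-567890-cluster123",
    "library_statuses": [
        {"library": {"pypi": {"package": "great-expectations"}}, "status": "INSTALLING"},
        {"library": {"pypi": {"package": "pandas"}}, "status": "INSTALLED"},
    ],
}
print(not_yet_installed(sample_status))
```

Keeping these helpers free of HTTP calls makes them easy to test locally; the payload can be passed as the `json` argument of the request shown above.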
4. Fix import paths
# Wrong: fails unless the folder that contains utils/ is already on sys.path
# from utils.my_module import helper
# Right: add the directory that contains my_module.py to sys.path
import sys
sys.path.append("/Workspace/Users/myuser/utils")
from my_module import helper
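Notebook cells are often rerun, and each rerun of a bare sys.path.append adds the same entry again. The following sketch guards against that; `add_to_path` is a hypothetical helper, and a temporary folder stands in for /Workspace/Users/myuser/utils so the example runs anywhere.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def add_to_path(directory):
    """Append `directory` to sys.path once, so reruns of the cell do not duplicate it."""
    directory = str(directory)
    if directory not in sys.path:
        sys.path.append(directory)


# Stand-in for /Workspace/Users/myuser/utils: a temp folder with a tiny module
utils_dir = Path(tempfile.mkdtemp())
(utils_dir / "my_module.py").write_text("def helper():\n    return 'ok'\n")

add_to_path(utils_dir)
add_to_path(utils_dir)  # second call is a no-op

importlib.invalidate_caches()  # make sure the freshly written file is seen
from my_module import helper

print(helper())
```

In a real notebook, replace the temporary folder with the workspace directory that holds your module.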
Common Mistakes
| Mistake | Fix |
|---|---|
| Installing libraries per session instead of cluster | Install at cluster level for persistence |
| Using !pip install instead of %pip | Use %pip, which ensures the package is installed in the notebook's Python environment |
| Library version conflicts with Databricks runtime | Check Databricks runtime compatibility matrix |
| Importing local files without path setup | Add the file path to sys.path or upload as cluster library |
| Missing init scripts for custom packages | Use cluster init scripts for pre-installed libraries |
Prevention
- Install libraries at cluster level for production workloads.
- Use %pip in notebooks for ad-hoc installations.
- Document required libraries in the cluster configuration.
- Use Unity Catalog volumes to share code across notebooks.
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro