Biopython SeqRecord Fix

DodaTech Updated 2026-06-26 2 min read

You will learn how to create sequence records with metadata annotations.

The Problem

SeqRecord objects are easy to create, but they are often left without the metadata that downstream tools expect, which leads to runtime errors or incomplete output files. This quick-fix guide shows how to build a SeqRecord in Biopython, attach annotations to it, and avoid the common pitfalls.

The Wrong Way

A common mistake is to stop after constructing the record with only an id and a description, leaving its annotations empty:

from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
record = SeqRecord(Seq('ATGC'), id='test001', description='Test sequence')
print(record.id, record.description)

What happens: test001 Test sequence

This code runs without error, but the record carries no metadata beyond its id and description. record.annotations is still an empty dictionary, so details such as the organism or molecule type are missing when the record is passed on or written out.

The Right Way

Keep the same constructor call, then attach metadata through the record.annotations dictionary:

record.annotations['organism'] = 'Homo sapiens'
record.annotations['molecule_type'] = 'DNA'  # key the GenBank writer looks for
print(record.annotations)

Expected output:

{'organism': 'Homo sapiens', 'molecule_type': 'DNA'}
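For reference, the constructor call and the annotations can be combined into one self-contained sketch. It assumes a recent Biopython release (1.78 or later), where the GenBank writer expects a molecule_type annotation. The sequence and identifiers are placeholder values.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Build the record and pass the metadata in a single call.
record = SeqRecord(
    Seq("ATGC"),
    id="test001",
    description="Test sequence",
    annotations={"organism": "Homo sapiens", "molecule_type": "DNA"},
)

# format() returns the record as a string in the named file format.
fasta_text = record.format("fasta")
genbank_text = record.format("genbank")

print(fasta_text)
print(genbank_text)
```

Passing annotations to the constructor is equivalent to filling record.annotations afterwards. Which one to use is a matter of style.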

Step-by-Step Fix

1. Inspect the record and its types

Before applying any operation, verify what you are working with. Most SeqRecord errors come from type mismatches, such as a plain string where a Seq object is expected, or from missing metadata.

# Always inspect your record first
print(type(record))        # should be Bio.SeqRecord.SeqRecord
print(type(record.seq))    # should be Bio.Seq.Seq, not str
print(len(record.seq))     # sequence length
print(record.annotations)  # metadata dictionary, empty until populated

2. Apply the correct method with proper arguments

Use the corrected code shown above. Pass id, name, and description to SeqRecord as keyword arguments, and store metadata in record.annotations, either after construction or through the annotations keyword argument.

3. Verify the result

Always validate that the output matches expectations before proceeding:

# Verification pattern
assert isinstance(record.seq, Seq), "record.seq should be a Seq object"
assert record.annotations.get('organism') == 'Homo sapiens', "organism annotation missing"
print(f"Success: {record.id} has {len(record.annotations)} annotations")
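The verification step can also be wrapped in a small reusable check. The sketch below is one possible approach, not part of Biopython: the helper name missing_annotations and the list of required keys are hypothetical and should be adapted to your own project conventions.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical project convention: keys every record should carry.
REQUIRED_ANNOTATIONS = ("organism", "molecule_type")


def missing_annotations(record, required=REQUIRED_ANNOTATIONS):
    """Return the required annotation keys that the record lacks."""
    return [key for key in required if key not in record.annotations]


record = SeqRecord(Seq("ATGC"), id="test001", description="Test sequence")
record.annotations["organism"] = "Homo sapiens"

missing = missing_annotations(record)
print(missing)  # ['molecule_type']
```

An empty list means the record has everything the check asks for, so the function is easy to use inside an assert or a test.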

Prevention Tips

Use SeqRecord(seq, id='...', description='...') to create sequence records.
Add metadata through the record.annotations dictionary.
Store the feature table in record.features as SeqFeature objects.
Use record.dbxrefs for cross-references to databases.
Use record.letter_annotations for per-position annotations.

Common Mistakes

Creating a SeqRecord without an id: the record keeps the placeholder '<unknown id>', which can produce invalid or misleading FASTA and GenBank output.
Assuming a sliced or reassigned record has its own annotations: a slice may not carry the parent's annotations, and plain assignment shares the same dictionary. Use copy.deepcopy(record) when you need an independent copy.
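Both pitfalls are easy to reproduce. The following is a minimal sketch with placeholder sequences. It uses copy.deepcopy from the standard library as one way to obtain an independent record, which is an assumption about what "independent copy" should mean here.

```python
import copy

from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# A record built without an id silently receives a placeholder value.
anonymous = SeqRecord(Seq("ATGC"))
print(anonymous.id)

# deepcopy duplicates the record together with its annotations dictionary.
original = SeqRecord(
    Seq("ATGCATGC"),
    id="test001",
    annotations={"organism": "Homo sapiens"},
)
clone = copy.deepcopy(original)
clone.annotations["organism"] = "Mus musculus"

# The original is unaffected by the change made to the clone.
print(original.annotations["organism"])
```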

These mistakes appear frequently in real-world bioinfo code. DodaTech's contributors have identified these patterns through analysis of open-source projects, production systems, and community forums like Stack Overflow.

Practice Exercise

Create a SeqRecord with annotations for a human gene sequence and write it to both FASTA and GenBank formats.

This exercise reinforces the concepts covered in this guide. Try implementing it before checking online solutions. This hands-on approach ensures you retain the knowledge and can apply it independently.

FAQ

What is the difference between id and name?

id is the primary identifier (accession number); name is a short display name.
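To see where each field ends up, the sketch below formats one record two ways. The id and name values are placeholders rather than real database entries, and the molecule_type annotation is included on the assumption that a recent Biopython release is in use.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

record = SeqRecord(
    Seq("ATGC"),
    id="EX000001.1",   # placeholder accession.version, not a real entry
    name="EXAMPLE1",   # placeholder short name
    description="Test sequence",
    annotations={"molecule_type": "DNA"},
)

fasta_header = record.format("fasta").splitlines()[0]
locus_line = record.format("genbank").splitlines()[0]

print(fasta_header)  # FASTA header is built from id and description
print(locus_line)    # GenBank LOCUS line is built from name
```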

How do I add features?

Use SeqFeature objects with FeatureLocation and add to record.features.
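A minimal sketch of that answer, assuming a placeholder 9-base sequence and a hypothetical gene name:

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

record = SeqRecord(Seq("ATGGCCTAA"), id="test001", description="Test sequence")

# Mark the whole 9-base sequence as a coding region on the forward strand.
# Positions are zero-based and end-exclusive, like Python slices.
cds = SeqFeature(
    FeatureLocation(0, 9, strand=1),
    type="CDS",
    qualifiers={"gene": ["example"]},  # hypothetical gene name
)
record.features.append(cds)

# extract() pulls the feature's bases back out of the parent sequence.
extracted = cds.extract(record.seq)
print(extracted)
```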

Can SeqRecord store quality scores?

Yes. Use record.letter_annotations['phred_quality'] for FASTQ quality scores.
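As a short illustration, the sketch below attaches made-up quality scores to a four-base read and formats it as FASTQ. The read id and the scores are placeholders.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# An empty description keeps the FASTQ title line to just the id.
record = SeqRecord(Seq("ATGC"), id="read001", description="")

# One integer score per base; the list length must match the sequence length.
record.letter_annotations["phred_quality"] = [40, 38, 30, 20]

fastq_text = record.format("fastq")
print(fastq_text)
```

The "fastq" format encodes each score as a single character using the Sanger offset of 33, so the scores above become the quality string IG?5.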

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. DodaTech tools integrate seamlessly with Python Data Science workflows for enhanced productivity and security.
