How to Do a Systematic Literature Review: A Practical Workflow for Scientists

Step-by-step workflow for conducting a thorough literature review with search strategy, screening, and synthesis.

A literature review is not something you do once at the start of your PhD and forget about. It’s an ongoing process. Every time you begin a new project, you need to understand what’s already known. Every time you write a grant, you need to justify your proposed work against existing research. Every time you publish, you need to position your findings in the current landscape.

Most scientists do literature reviews informally. You search PubMed, skim abstracts, read 20 papers, and conclude you understand the field. This works for background knowledge but fails when you need thoroughness. If you’re writing a thesis chapter, a grant, or a published review paper, your literature review must be systematic. It must be reproducible. Someone should be able to follow your search strategy and arrive at approximately the same papers you found.

This guide walks you through conducting a systematic literature review that’s rigorous but practical. You’re not necessarily aiming for a PRISMA-compliant (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) formal systematic review ready for publication. Instead, you’re doing a thorough, well-documented review that you can stand behind for a thesis, grant, or research planning.

Section 1: Defining Your Question Before You Start

The most common mistake is starting your search before you know exactly what you’re searching for. You have a vague topic, you search PubMed, you get 10,000 results, and you’re paralyzed.

Define your question precisely before searching.

For clinical topics, use the PICO framework:

  • Population: Who are the subjects? (e.g., colorectal cancer patients post-resection)
  • Intervention: What is being tested or done? (e.g., circulating tumor DNA monitoring)
  • Comparison: What is being compared to? (e.g., standard imaging surveillance)
  • Outcome: What are you measuring? (e.g., time to recurrence detection, overall survival)

PICO question: “In colorectal cancer patients post-resection (P), does circulating tumor DNA monitoring (I) compared to standard imaging (C) improve time to recurrence detection (O)?”

For basic science topics, PICO doesn’t always fit. Instead, define your research question clearly:

Research question: “What mechanisms underlie CAR-T cell exhaustion in solid tumors, and how do they differ from exhaustion in hematologic malignancies?”

Define your scope:

Will you include only peer-reviewed articles, or also preprints and grey literature? Will you include only English-language papers, or also non-English publications you can access? What years will you cover? The most recent 5 years, or all available literature?

Example scope: “We will review English-language peer-reviewed articles published between 2018 and 2025 in PubMed. We will exclude case reports and editorials. We will include primary research and systematic reviews.”

Define inclusion and exclusion criteria:

Write these down before you start searching. This prevents bias. Example:

Include:

  • Original research (clinical trials, observational studies, lab studies)
  • English language
  • Human subjects or relevant animal models (mice, non-human primates)
  • Published 2018 or later

Exclude:

  • Review articles
  • Case reports and small case series (fewer than 5 subjects)
  • Editorials, opinion pieces
  • Papers without accessible abstract
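
Criteria written down this way translate directly into a screening checklist. Here is a minimal Python sketch — the field names (`year`, `language`, `pub_type`, `has_abstract`) are hypothetical placeholders for whatever columns your export actually contains:

```python
# Sketch: the example criteria above encoded as a screening checklist.
# Field names (year, language, pub_type, has_abstract) are hypothetical;
# rename them to match your actual export columns.

def screen(paper: dict) -> tuple[bool, str]:
    """Return (include, reason), applying the written criteria in order."""
    if paper["year"] < 2018:
        return False, "published before 2018"
    if paper["language"] != "English":
        return False, "non-English"
    if not paper.get("has_abstract", False):
        return False, "no accessible abstract"
    if paper["pub_type"] in {"review", "editorial", "opinion piece", "case report"}:
        return False, f"excluded publication type: {paper['pub_type']}"
    return True, "meets all criteria"

example = {"year": 2021, "language": "English",
           "has_abstract": True, "pub_type": "clinical trial"}
print(screen(example))  # (True, 'meets all criteria')
```

Applying the checks in a fixed order also means every exclusion gets a single, consistent reason you can tally later.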

Section 2: Building Your Search Strategy

A good search strategy is the backbone of a systematic review. It balances sensitivity (finding all relevant papers) with precision (avoiding too many irrelevant papers).

Search in PubMed:

Go to PubMed Advanced Search. This gives you more control than the simple search box.

Use MeSH terms (Medical Subject Headings) to capture concepts even when papers use different terminology. For your search, combine concepts with AND/OR/NOT:

  • Concept 1 (Population): colorectal cancer OR colon cancer (use MeSH: Colorectal Neoplasms)
  • Concept 2 (Intervention): circulating tumor DNA OR ctDNA OR liquid biopsy (MeSH: Circulating Tumor DNA)
  • Concept 3 (Outcome): recurrence OR prognosis OR survival

Combined search string:

(colorectal neoplasms[mesh] OR colon cancer[tiab])
AND
(circulating tumor DNA[tiab] OR ctDNA[tiab] OR "liquid biopsy"[tiab])
AND
(recurrence[tiab] OR prognosis[tiab] OR survival[tiab])
NOT
(review[publication type])

Field tags:

  • [mesh] searches only MeSH terms
  • [tiab] searches title and abstract
  • [pt] (or the spelled-out [publication type]) filters by publication type

This search is more focused than “colorectal cancer recurrence” alone. You’ll get fewer irrelevant results.
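
Once your synonym lists grow, assembling the boolean string by hand gets error-prone. A small Python sketch can build it mechanically — the `or_block` helper is illustrative, not part of any PubMed API:

```python
# Sketch: assemble the boolean search string from per-concept synonym lists.
# or_block is an illustrative helper name, not a PubMed API.

def or_block(terms: list[str]) -> str:
    """Join one concept's synonyms with OR, wrapped in parentheses."""
    return "(" + " OR ".join(terms) + ")"

population = ["colorectal neoplasms[mesh]", "colon cancer[tiab]"]
intervention = ["circulating tumor DNA[tiab]", "ctDNA[tiab]", '"liquid biopsy"[tiab]']
outcome = ["recurrence[tiab]", "prognosis[tiab]", "survival[tiab]"]

# Concepts are ANDed together; [pt] abbreviates PubMed's publication-type tag.
query = " AND ".join(or_block(c) for c in (population, intervention, outcome))
query += " NOT review[pt]"
print(query)
```

Keeping synonyms in lists also makes it trivial to document exactly which terms each concept included.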

Search other databases if available:

If your institution provides Embase access, search there too. Embase indexes more non-English journals and medical-device literature. Use similar search strings adjusted for Emtree, Embase’s thesaurus.

For preprints and grey literature, search Google Scholar using the same concepts. Google Scholar indexes preprints on bioRxiv and medRxiv, which may be relevant if you’re covering recent research.

Document your searches:

Create a spreadsheet with:

  • Database (PubMed, Embase, Google Scholar)
  • Search date
  • Search string
  • Number of results
  • Notes

This documentation is essential for reproducibility. Someone reading your thesis or paper should be able to find your search string and retrieve approximately the same papers.
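
The log itself can be maintained with a few lines of Python — a sketch assuming a hypothetical `search_log.csv` whose columns mirror the list above:

```python
# Sketch: append one row per search to a reproducibility log.
# "search_log.csv" is a hypothetical filename; columns mirror the list above
# (database, date, search string, number of results, notes).
import csv
from datetime import date

def log_search(path, database, search_string, n_results, notes=""):
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [database, date.today().isoformat(), search_string, n_results, notes]
        )

log_search("search_log.csv", "PubMed",
           "(colorectal neoplasms[mesh]) AND (ctDNA[tiab])", 847, "initial search")
```

Appending (rather than overwriting) preserves the full history of every search you ran, including re-runs.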

Section 3: Managing and Deduplicating Results

Your PubMed search returns results in a list. You need to export them to a spreadsheet or reference manager so you can track which papers you’ve screened.

Export from PubMed:

On your search results page, select all results (check the box at the top of the list). Click “Send to” and choose “File”. Select the “PubMed” format, which downloads an .nbib file (PubMed’s native citation format).

Alternatively, select “CSV”. CSV is simpler to work with in a spreadsheet, though it carries less metadata (no abstracts) than the .nbib export.
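
If you prefer scripting over a reference manager, the .nbib export is easy to parse: each record is a block of tag–dash–value lines separated by blank lines, with continuation lines indented. A minimal sketch (it keeps only the last value for repeated tags such as AU, so treat it as quick triage, not a full parser):

```python
# Sketch: minimal parser for PubMed's .nbib (MEDLINE tag) export.
# Repeated tags (e.g. AU for each author) overwrite each other here;
# a real parser would collect them into lists.

def parse_nbib(text: str) -> list[dict]:
    records, current, last_tag = [], {}, None
    for line in text.splitlines():
        if not line.strip():                     # blank line = end of record
            if current:
                records.append(current)
                current, last_tag = {}, None
            continue
        if line.startswith("      ") and last_tag:  # indented continuation line
            current[last_tag] += " " + line.strip()
        elif "- " in line:
            tag, value = line.split("- ", 1)
            last_tag = tag.strip()
            current[last_tag] = value.strip()
    if current:
        records.append(current)
    return records

sample = "PMID- 12345678\nTI  - ctDNA monitoring in colorectal\n      cancer.\n"
print(parse_nbib(sample))
# [{'PMID': '12345678', 'TI': 'ctDNA monitoring in colorectal cancer.'}]
```

This gives you PMIDs and titles as plain dicts, ready to drop into a screening spreadsheet.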

Import into a reference manager:

Zotero is free and open-source. Download it and create a folder for this literature review. Drag your exported PubMed file into Zotero, and it imports all papers with metadata (author, title, DOI, etc.).

Alternatively, use a simple spreadsheet (Google Sheets or Excel). Copy and paste the exported results into a spreadsheet with columns: Title, Authors, Year, Journal, Abstract, Notes, Included/Excluded.

Deduplicate:

Your searches across multiple databases may retrieve the same paper twice. Remove duplicates. In Zotero, click “Duplicate Items” in the left sidebar and merge the matches it finds. In a spreadsheet, sort by title and manually check for duplicates.

After deduplication, document your numbers. Example: “Initial search returned 847 results. After deduplication, 723 unique papers remained.”
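
Deduplication can also be scripted. A sketch that keys on DOI when one is present and a normalized title otherwise — the `doi`/`title` field names are assumptions about your export:

```python
# Sketch: deduplicate records by DOI when present, else by normalized title.
# Field names (doi, title) are assumptions about your export's columns.
import re

def dedupe(records: list[dict]) -> list[dict]:
    seen, unique = set(), []
    for r in records:
        doi = (r.get("doi") or "").lower().strip()
        # Strip punctuation/case so "ctDNA in CRC" and "ctDNA in CRC." match.
        title = re.sub(r"[^a-z0-9]", "", (r.get("title") or "").lower())
        key = ("doi", doi) if doi else ("title", title)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

papers = [
    {"title": "ctDNA in CRC", "doi": "10.1000/x1"},
    {"title": "ctDNA in CRC.", "doi": "10.1000/X1"},  # same DOI, different case
    {"title": "Liquid biopsy review", "doi": ""},
]
print(len(dedupe(papers)))  # 2
```

DOI matching is the more reliable key; title matching catches records from databases that don’t export DOIs.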

Section 4: Title and Abstract Screening

Now you screen papers to see which meet your inclusion criteria. This is a two-stage process. Stage 1 is title and abstract screening. Stage 2 is full-text review. This two-stage approach is faster and reduces bias.

Stage 1: Title and abstract screening

Read the title and abstract. Does the paper match your research question and inclusion criteria? Yes or no. You don’t read the full text yet.

Be inclusive at this stage. If you’re unsure, include it for full-text review. It’s better to over-include here and exclude later when you have more information.

Create a screening form or spreadsheet column with options: Include, Exclude, Uncertain. For papers you exclude, note why. Example: “Exclude: pediatric population, not adult post-resection colorectal cancer” or “Exclude: review article.”

Use a screening tool:

For larger reviews (100+ papers), use Rayyan, a free tool made for systematic reviews. You import your papers and screen them online. Rayyan tracks which papers you’ve screened and supports blinded screening by two independent reviewers (important for reducing bias).

How to use Rayyan:

  1. Sign up (free).
  2. Create a new review project.
  3. Upload your CSV or import from PubMed/Zotero.
  4. Invite a colleague as a second screener (optional but recommended for rigorous reviews).
  5. Each screener reviews abstracts independently.
  6. Rayyan flags conflicts between screeners; from the exported decisions you can compute Cohen’s kappa (a chance-corrected measure of agreement).
  7. You discuss disagreements and make final decisions.

If you’re the only screener (common for thesis chapters), just use Rayyan solo. It still tracks your decisions and documents your screening process.
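
Cohen’s kappa itself is simple enough to compute yourself from two screeners’ exported decisions — a minimal sketch:

```python
# Sketch: Cohen's kappa for two screeners' include/exclude decisions.
# Undefined (division by zero) only if both screeners used a single label.
from collections import Counter

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """a, b: equal-length lists of labels such as 'include'/'exclude'."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n   # raw agreement
    ca, cb = Counter(a), Counter(b)
    # Agreement expected by chance, given each screener's label frequencies.
    expected = sum((ca[l] / n) * (cb[l] / n) for l in set(a) | set(b))
    return (observed - expected) / (1 - expected)

screener1 = ["include", "exclude", "exclude", "include", "exclude"]
screener2 = ["include", "exclude", "include", "include", "exclude"]
print(round(cohens_kappa(screener1, screener2), 2))  # 0.62
```

Values above roughly 0.6 are conventionally read as substantial agreement; much lower values suggest your criteria need sharpening before you continue.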

Document screening results:

Report the number of papers at each stage:

  • Papers retrieved: 723
  • After title/abstract screening: 156
  • Full texts sought: 42
  • Final included: 38 (4 excluded because the full text could not be retrieved)

This transparency shows you did a thorough, systematic review.
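
A quick tally of your recorded exclusion reasons makes this reporting easy — a sketch using Python’s Counter (the reasons list is illustrative; in practice you’d read it from your screening sheet):

```python
# Sketch: tally exclusion reasons so you can report "excluded n=X for reason Y".
from collections import Counter

# Illustrative decisions; in practice, read these from your screening sheet.
exclusion_reasons = [
    "review article", "pediatric population", "review article",
    "no accessible abstract", "review article",
]

for reason, n in Counter(exclusion_reasons).most_common():
    print(f"{reason}: {n}")
```

The per-reason counts slot directly into a flow-diagram box or a methods paragraph.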

Section 5: Full-Text Review and Data Extraction

For papers that passed title/abstract screening, you now read the full text. This is where you confirm inclusion and extract data.

Stage 2: Full-text review

Download the full PDFs. Read the full paper, not just the abstract. Confirm inclusion against your criteria. Document any papers you exclude at this stage and why. Common reasons: different population than specified, methodology doesn’t match, outcomes not relevant, insufficient data quality.

Data extraction:

Create a data extraction template. This is a form or spreadsheet with columns capturing the key information from each paper. Your columns depend on your question, but generally include:

  • Study design (RCT, observational, case-control, lab study)
  • Sample size
  • Population characteristics
  • Intervention/exposure description
  • Comparison/control group
  • Primary outcomes measured
  • Key findings (with numbers)
  • Limitations (noted by authors or you)
  • Funding source
  • Study quality score (if using a quality assessment tool)

Worked example for a colorectal cancer review:

| Paper | Design | N | Population | Intervention | Outcome | Key Finding | Limitations | Study Quality |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Smith 2023 | Prospective cohort | 150 | CRC post-resection, stage II-III | ctDNA monitoring every 6 weeks | Time to recurrence | Median 3.2 months lead time before imaging (p<0.01) | Single center, limited to early stage | 7/10 |
| Jones 2022 | RCT | 200 | CRC post-resection, stage III | ctDNA-guided surveillance vs. standard imaging | RFS (recurrence-free survival) | No difference in RFS at 2 years (HR 0.94, p=0.4) | Underpowered, only 2-year follow-up | 6/10 |

Enter data consistently. Use the same terminology across all papers (don’t mix “colorectal cancer” and “CRC” inconsistently).
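
A small synonym map enforces that consistency mechanically — a sketch with an illustrative mapping you would grow as you extract:

```python
# Sketch: normalize synonyms so extraction entries stay consistent.
# The mapping is illustrative; extend it with your own field's synonyms.

SYNONYMS = {
    "crc": "colorectal cancer",
    "colorectal carcinoma": "colorectal cancer",
    "rfs": "recurrence-free survival",
}

def normalize(term: str) -> str:
    """Map a term to its canonical form; pass unknown terms through unchanged."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)

print(normalize("CRC"))  # colorectal cancer
```

Run every free-text cell through the map once extraction is done, and inconsistencies like “CRC” vs. “colorectal cancer” disappear.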

Quality assessment:

For a rigorous review, assess the methodological quality of each paper. Use a validated tool like:

  • ROBINS-I (for observational studies)
  • Cochrane Risk of Bias tool (for RCTs)
  • Newcastle-Ottawa scale (simple alternative for observational studies)

These tools score papers on bias risk. A low-quality paper (high bias risk) carries less weight in your conclusions.

Section 6: Synthesis and Writing Your Review

Once you’ve extracted data from all included papers, you synthesize the findings. This means organizing and summarizing what you learned.

Narrative synthesis:

Organize your findings by theme, not by paper. Don’t write: “Smith found X, Jones found Y, Brown found Z.” Instead: “Three themes emerged from the literature: mechanism of action (Smith 2023, Jones 2022), clinical outcomes (Brown 2024), and manufacturing challenges (Davis 2023).”

Group papers by subquestions within your research question. Example, for a CAR-T cell review:

  • What causes CAR-T cell exhaustion in solid tumors?
  • How do T cell metabolism and nutrient availability affect persistence?
  • What clinical outcomes do different CAR-T designs show?

Write a section for each theme, synthesizing across papers.

Meta-analysis (awareness only):

If you have many RCTs with similar outcomes, you might do a meta-analysis: statistically combine effect sizes across studies to get an overall estimate. Meta-analysis is beyond this guide (requires specific statistical training), but you should know it exists. If you have 10+ RCTs measuring the same outcome, consult a statistician about meta-analysis. For most reviews, narrative synthesis is appropriate.

Identify gaps:

As you review, note what’s missing. What questions are answered well? What questions have almost no data? What contradictions exist in the literature?

Example: “Most studies measured short-term CAR-T cell expansion (0-3 months). Long-term persistence (>5 years) has been reported in only two small studies. This gap represents an important area for future research.”

Create a summary table:

Compile a table of all included studies. Authors, year, design, sample size, key findings. This table is invaluable for readers and reviewers.

Section 7: Keeping Your Review Updated

Once you finish your review, the field doesn’t stop publishing. If you’re writing a thesis chapter or planning a project, you need to monitor for new papers.

Set up automated literature alerts for your review topic. Monthly Google Scholar alerts or PubMed email alerts will catch new papers. Every 3-6 months, run your original search string again in PubMed. Screen any new papers using the same criteria. Update your synthesis with new findings.
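
When you re-run the search, you only want the papers you haven’t already screened. If you track PMIDs, the diff is a single set operation — a sketch (the file of previously screened PMIDs is hypothetical):

```python
# Sketch: keep only PMIDs not already screened in an earlier round.
# "already_screened" would typically come from your log, e.g. one PMID per line.

def new_pmids(current_results, already_screened):
    """Both arguments are iterables of PMID strings; returns the sorted diff."""
    return sorted(set(current_results) - set(already_screened))

print(new_pmids(["111", "222", "333"], ["111", "222"]))  # ['333']
```

Only the returned PMIDs need title/abstract screening, so each update round stays small.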

This ongoing monitoring is often overlooked but essential. Your literature review is current on the day you write it. Three months later, new data might contradict your conclusions.

Section 8: Tools and Their Roles

Here’s a quick reference for the tools mentioned:

| Tool | Purpose | Free? | Notes |
| --- | --- | --- | --- |
| PubMed | Search biomedical literature | Yes | Primary source, requires learning search syntax |
| Google Scholar | Broad search including preprints | Yes | Less precise than PubMed, catches grey literature |
| Zotero | Reference management, data organization | Yes | Open-source, stores PDFs, integrates with Word |
| Rayyan | Screening and data management | Yes | Designed for systematic reviews, tracks multiple screeners |
| Embase | Medical research literature (indexed differently than PubMed) | No (institutional access) | More comprehensive than PubMed for some topics |
| Excel/Google Sheets | Spreadsheet for data extraction | Yes | Simple, works for small-medium reviews |

Next Steps

Start your systematic review by writing down your research question and inclusion/exclusion criteria. Spend time on this step. A clearly defined question prevents you from drifting and searching too broadly. Once your question is pinned down, develop your search strategy and run it in PubMed and Google Scholar.

Export your results, deduplicate, and plan your screening. If you have more than 100 papers, set up a Rayyan account and do your screening there. It’s worth learning the tool. If you have fewer than 100 papers, a simple spreadsheet works fine.

Screen your papers, extract data consistently, and organize your synthesis by theme rather than paper. Document your process thoroughly so someone else could reproduce your review.

Finally, if you’re writing a formal systematic review paper for publication, read the PRISMA guidelines (available free online). PRISMA outlines how to report a systematic review comprehensively. It covers what to include in your methods, results, and discussion. Following PRISMA increases the credibility and reproducibility of your work.

If you want a textbook-level treatment of the full systematic review process — from protocol registration through meta-analysis and GRADE evidence rating — Systematic Approaches to a Successful Literature Review by Andrew Booth and colleagues covers every stage in depth.