
Build a Smart Resume Ranker with Python and Natural Language Processing

Use AI to evaluate and score resumes against a job description



Introduction

Tired of manually skimming resumes to find the right candidate? A simple NLP pipeline can handle first-pass screening by ranking resumes according to their relevance to a job description. In this project, we'll build a resume ranking system using TF-IDF and cosine similarity, two simple NLP techniques that can save a lot of time.

What You'll Learn

How to parse text from resumes and job descriptions
How to clean and preprocess resume content
How to convert documents into numerical form using TF-IDF
How to calculate similarity scores using cosine similarity
How to rank resumes based on relevance
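The implementation below relies on TfidfVectorizer's default preprocessing, so here is a minimal sketch of what an explicit cleaning step might look like. The helper name clean_text, the sample string, and the choice to keep "+" and "#" are assumptions for illustration, not part of the original code.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer

def clean_text(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace (hypothetical helper)."""
    text = text.lower()
    # Keep letters, digits, "+" and "#" so terms like "c++" and "c#" survive
    text = re.sub(r"[^a-z0-9+#\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

raw = "  Software engineer skilled in Java, C++,\n and cloud services (AWS/Azure). "
print(clean_text(raw))
# software engineer skilled in java c++ and cloud services aws azure

# Assumption: split on whitespace so the cleaned tokens are used as-is.
# The default token pattern would discard "c++".
vectorizer = TfidfVectorizer(preprocessor=clean_text, token_pattern=r"\S+")
vectorizer.fit([raw])
print("c++" in vectorizer.vocabulary_)
```

Passing the function as preprocessor keeps cleaning and vectorization in one object, so the same rules apply to the job description and to every resume.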

Code Implementation


from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Sample job description
job_desc = """
We are looking for a Python developer with experience in data analysis, pandas, NumPy,
machine learning, and writing clean, maintainable code.
"""

# Sample resumes
resumes = [
    "Experienced Python developer with knowledge of pandas, NumPy, and machine learning models.",
    "Software engineer skilled in Java, C++, and cloud services like AWS and Azure.",
    "Data scientist with strong Python skills, data cleaning, pandas, sklearn, and model evaluation.",
    "Content writer with strong verbal and written skills.",
    "I have no experience in programming",
    "Python developer with experience in data analysis, pandas, NumPy, machine learning, and writing clean, maintainable code."
]

# Combine all documents
documents = [job_desc] + resumes

# TF-IDF Vectorization
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(documents)

# Calculate cosine similarity between job description and each resume
cosine_similarities = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()

# Rank resumes from most to least similar, keeping each resume's original index
ranking = sorted(enumerate(cosine_similarities), key=lambda x: x[1], reverse=True)

# Display results
print("Resume Ranking:")
for idx, score in ranking:
    print(f"Resume {idx+1} - Relevance Score: {score:.2f}")

Resume Ranking:
Resume 6 - Relevance Score: 0.82
Resume 1 - Relevance Score: 0.35
Resume 3 - Relevance Score: 0.21
Resume 5 - Relevance Score: 0.13
Resume 2 - Relevance Score: 0.07
Resume 4 - Relevance Score: 0.06
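In the output above, Resume 5 ("I have no experience in programming") lands ahead of the Java engineer's resume. One likely contributor is that, with default settings, common words such as "in", "and", and "with" count as shared vocabulary. The sketch below repeats the same pipeline with scikit-learn's built-in English stop-word list; whether that list suits real resumes is an assumption worth checking on your own data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

job_desc = (
    "We are looking for a Python developer with experience in data analysis, "
    "pandas, NumPy, machine learning, and writing clean, maintainable code."
)
resumes = [
    "Experienced Python developer with knowledge of pandas, NumPy, and machine learning models.",
    "Software engineer skilled in Java, C++, and cloud services like AWS and Azure.",
    "Data scientist with strong Python skills, data cleaning, pandas, sklearn, and model evaluation.",
    "Content writer with a strong verbal and written skills.",
    "I have no experience in programming",
    "Python developer with experience in data analysis, pandas, NumPy, machine learning, and writing clean, maintainable code.",
]

# stop_words="english" drops common function words before weighting
vectorizer = TfidfVectorizer(stop_words="english")
tfidf_matrix = vectorizer.fit_transform([job_desc] + resumes)
scores = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()

# Print resumes from highest to lowest score
for idx in scores.argsort()[::-1]:
    print(f"Resume {idx + 1} - Relevance Score: {scores[idx]:.2f}")
```

With function words removed, Resumes 2 and 4 no longer share any terms with the job description and should score 0, while Resume 5 still matches on the word "experience". Simple word overlap cannot tell that "no experience" is a negative signal.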

Explanation

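TF-IDF weights each word by how often it appears in a document, discounted by how many of the documents (the job description plus all resumes) contain it, so distinctive terms count more than common ones.
Cosine similarity then measures the angle between the job description's vector and each resume's vector: the more weighted vocabulary they share, the higher the score.
Because it relies only on sparse vector arithmetic, this approach scales to large batches of resumes, which makes it a practical way to automate initial screening in HR workflows or on freelance platforms.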
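To make the cosine similarity step concrete, the sketch below computes it by hand as (a · b) / (|a| |b|) and compares the result with scikit-learn's function. The two short documents are made-up stand-ins, not the article's data.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "Python developer with pandas and machine learning experience",  # stand-in job description
    "Experienced Python developer who uses pandas daily",            # stand-in resume
]

# Dense arrays are fine for two short documents; keep the sparse matrix for real workloads
tfidf = TfidfVectorizer().fit_transform(docs).toarray()
job_vec, resume_vec = tfidf

# Dot product divided by the product of the vector lengths
manual = job_vec @ resume_vec / (np.linalg.norm(job_vec) * np.linalg.norm(resume_vec))
library = cosine_similarity(tfidf[0:1], tfidf[1:2])[0, 0]

print(f"manual={manual:.4f}  sklearn={library:.4f}")
```

The two numbers should agree. Because TF-IDF weights are never negative, the result always falls between 0 and 1.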

Understanding Resume Scores

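Each resume is compared to the job description using text similarity, which produces a Relevance Score between 0 and 1. As a rough rule of thumb:

A score closer to 1 means very relevant (e.g., 0.95 = strong match)
A score around 0.5 means moderately relevant
A score below 0.3 means a weak or poor match

These cutoffs are approximate, and scores are most useful for comparing resumes within the same batch. In the sample output, Resume 1 scores only 0.35 even though it lists most of the required skills.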

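For a quick read, multiply the score by 100 and treat it as a rough percentage match, keeping in mind that it reflects overlap in weighted vocabulary rather than the share of requirements met: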

0.38 → 38% match
0.23 → 23% match
0.15 → 15% match
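As a small illustration, the conversion and the rough labels can be wrapped in one helper. The function name describe_score and the 0.7 cutoff for a strong match are assumptions; the article only gives 0.3 as the boundary for a weak match.

```python
def describe_score(score: float) -> str:
    """Format a cosine similarity as a percentage with a rough label."""
    if score >= 0.7:      # assumed cutoff, not taken from the article
        label = "strong match"
    elif score >= 0.3:    # the article treats scores below 0.3 as weak
        label = "moderate match"
    else:
        label = "weak match"
    return f"{score:.0%} ({label})"

# Scores taken from the sample output earlier in the article
for s in (0.82, 0.35, 0.21):
    print(describe_score(s))
# 82% (strong match)
# 35% (moderate match)
# 21% (weak match)
```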

Conclusion

AI doesn’t always have to be complicated. This simple NLP-based resume ranker shows how basic machine learning concepts can have real-world impact, especially in recruitment, HR tech, and freelance hiring systems. You can extend it by adding PDF parsing, named entity recognition (NER), or large language models for deeper understanding. Thanks for reading. If you have suggestions or similar implementations, let me know in the comments. Until then, see you next time. Happy coding!
