Tired of manually skimming through resumes to match the right candidate? In today’s world, AI can do a first-level screening by ranking resumes based on their relevance to a job description. In this project, we'll build a smart resume ranking system using TF-IDF and cosine similarity, simple NLP techniques that can save a lot of time.
import os
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
# Sample job description
job_desc = """
We are looking for a Python developer with experience in data analysis, pandas, NumPy,
machine learning, and writing clean, maintainable code.
"""
# Sample resumes
resumes = [
"Experienced Python developer with knowledge of pandas, NumPy, and machine learning models.",
"Software engineer skilled in Java, C++, and cloud services like AWS and Azure.",
"Data scientist with strong Python skills, data cleaning, pandas, sklearn, and model evaluation.",
"Content writer with a strong verbal and written skills.",
"I have no experience in programming",
"Python developer with experience in data analysis, pandas, NumPy, machine learning, and writing clean, maintainable code."
]
# Combine all documents
documents = [job_desc] + resumes
# TF-IDF Vectorization
vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(documents)
# Calculate cosine similarity between job description and each resume
cosine_similarities = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()
# Rank resumes
ranking = sorted(list(enumerate(cosine_similarities)), key=lambda x: x[1], reverse=True)
# Display results
print("Resume Ranking:")
for idx, score in ranking:
    print(f"Resume {idx+1} - Relevance Score: {score:.2f}")
Output:
Resume Ranking:
Resume 6 - Relevance Score: 0.82
Resume 1 - Relevance Score: 0.35
Resume 3 - Relevance Score: 0.21
Resume 5 - Relevance Score: 0.13
Resume 2 - Relevance Score: 0.07
Resume 4 - Relevance Score: 0.06
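If you want to reuse the script above for more than one job posting, one option is to wrap it in a function. The sketch below is only a suggestion: the name `rank_resumes` and the `top_k` parameter are my own additions for illustration, and it assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def rank_resumes(job_desc, resumes, top_k=None):
    """Return (resume_index, score) pairs, most relevant first.

    `rank_resumes` and `top_k` are hypothetical names used for this sketch.
    """
    resumes = list(resumes)
    if not resumes:
        return []  # nothing to compare against

    # Fit TF-IDF on the job description and the resumes together
    documents = [job_desc] + resumes
    tfidf_matrix = TfidfVectorizer().fit_transform(documents)

    # Row 0 is the job description; the remaining rows are the resumes
    scores = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:]).flatten()

    ranking = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranking[:top_k]  # top_k=None keeps the full ranking


top = rank_resumes(
    "Python developer with pandas experience",
    [
        "Java engineer with AWS experience",
        "Python developer who uses pandas daily",
    ],
    top_k=1,
)
for idx, score in top:
    print(f"Resume {idx+1} - Relevance Score: {score:.2f}")
```

The indices returned are zero-based positions in the `resumes` list, which is why the print statement adds 1, just like the original script.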
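TF-IDF weights each word by how often it appears in a document, discounted by how common it is across all the documents (the job description plus the resumes), so distinctive terms like "pandas" count more than filler words.
Cosine similarity then measures the angle between the job description's vector and each resume's vector: the more weighted terms they share, the higher the score.
Because it only needs one vectorization pass and one similarity computation, this is a scalable way to automate initial screening in HR workflows or freelance platforms.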
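To make those two steps concrete, here is a rough pure-Python sketch of the same idea. It assumes the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` and L2 normalization, which is how I understand `TfidfVectorizer`'s defaults to work. The helper names (`tokenize`, `tfidf_vectors`, `cosine`) and the three toy documents are made up for illustration, and the tokenizer is a simplification.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep tokens of two or more word characters
    return re.findall(r"\b\w\w+\b", text.lower())


def tfidf_vectors(docs):
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed IDF: rarer terms get a larger weight
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1
           for term, count in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {term: count * idf[term] for term, count in tf.items()}
        # Scale each vector to unit length (L2 normalization)
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


docs = [
    "python developer with pandas experience",    # stands in for the job description
    "python developer skilled in pandas",         # toy resume 1
    "content writer with strong written skills",  # toy resume 2
]
vecs = tfidf_vectors(docs)
scores = [cosine(vecs[0], v) for v in vecs[1:]]
for i, score in enumerate(scores, start=1):
    print(f"Toy resume {i}: {score:.2f}")
```

The first toy resume shares "python", "developer", and "pandas" with the job description, so it scores higher than the second, which only shares the common word "with".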
Each resume is compared to the job description using text similarity, which produces a relevance score between 0 and 1. As a rough guide:
A score closer to 1 means very relevant (e.g., 0.95 = strong match)
A score around 0.5 means moderately relevant
A score below 0.3 means a weak match
For a quick read, you can treat the score as a rough percentage match by multiplying it by 100 (it is a similarity measure, not a literal share of matching words):
0.38 → 38% match
0.23 → 23% match
0.15 → 15% match
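If you'd like the script to print these readings for you, a small helper along these lines could work. The function name `describe_score` is hypothetical, and the exact cutoffs are my assumption: the article only says "closer to 1", "around 0.5", and "below 0.3", so I picked 0.7 as the boundary for a strong match.

```python
def describe_score(score):
    """Turn a cosine similarity score into a short, human-readable label."""
    if score >= 0.7:       # assumed cutoff for "closer to 1"
        label = "strong"
    elif score >= 0.3:     # everything from 0.3 up to 0.7
        label = "moderate"
    else:                  # below 0.3, as described in the article
        label = "weak"
    # The .0% format multiplies by 100 and appends a percent sign
    return f"{score:.0%} match ({label})"


for score in (0.95, 0.38, 0.23, 0.15):
    print(describe_score(score))
```

Treat the labels as a starting point and tune the cutoffs on your own data, since typical scores depend on how long and how varied the resumes are.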
AI doesn’t always have to be complicated. This simple NLP-based resume ranker shows how basic text-processing techniques can create real-world impact, especially in recruitment, HR tech, or freelance hiring systems. You can expand it further by adding PDF parsing, named entity recognition (NER), or large language models for deeper understanding.

Thanks for reading! If you have suggestions or similar implementations, let me know in the comments. Until then, see you next time. Happy coding!