The Fake Jobs Detection project analyzes the Fake Jobs Postings dataset from Kaggle, a collection of real (label 0) and fraudulent (label 1) job postings. The project aims to understand the characteristics of the various classification techniques that separate real job postings from fraudulent ones.
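
As a rough illustration of the 0/1 labeling described above, the sketch below counts real and fraudulent postings in a CSV file. It is not taken from the project's scripts: the column name `fraudulent`, the helper `count_labels`, and the three sample rows are assumptions made for illustration, so adjust them to match the actual dataset.

```python
import csv
import io
from collections import Counter

# Assumption: the label column is named "fraudulent"; change it to match the real CSV.
LABEL_COLUMN = "fraudulent"


def count_labels(csv_file, label_column=LABEL_COLUMN):
    """Count real (label 0) and fraudulent (label 1) postings in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter(int(row[label_column]) for row in reader)
    # Counter returns 0 for a label that never appears.
    return {"real": counts[0], "fraudulent": counts[1]}


# Made-up sample rows standing in for the real data.
SAMPLE_CSV = """title,fraudulent
Data Analyst,0
Work from home - earn fast,1
Software Engineer,0
"""

sample_counts = count_labels(io.StringIO(SAMPLE_CSV))
print(sample_counts)  # {'real': 2, 'fraudulent': 1}
```

To run the same count on a file on disk, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points at wherever the dataset is stored.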

The analysis is presented as a series of “blog” posts that interpret the classification techniques implemented in Python.

Useful URLs for this project

File and directory organization

  1. data/
    • This directory contains raw and processed data used by the classifiers.
  2. docs/
    • This directory contains the blog posts. It also has the configuration files used by GitHub Pages to render the posts.
  3. notes/
    • This directory contains lab notes compiled during the analysis. These notes are used as raw material for blog posts.
  4. scripts/
    • This directory contains the Python scripts for data processing that run on a local machine.
  5. admin/
    • This directory contains scripts and guidelines for administering the GitHub repository.
  6. LICENSE
    • This is the license under which all the work in the Fake Jobs Detection project is made available.
  7. README.md
    • This is the top-level Markdown file that describes the project. Because all documentation is maintained as blog posts, this file points to the blog.
  8. *.ipynb files
    • These are Google Colab Jupyter notebooks. Most classifiers are implemented on Colab (and not as scripts run on the local machine).