Ensembles of other models
Finally, we build ensembles from some of the models that we tried earlier in this project.
Averaging Ensemble
There are many techniques to combine individual models into an ensemble (1).
We first attempt a straightforward technique that averages the probability of a job being fraudulent (2) across the bag-of-words (3), LSTM (4), and Transformer (5) models. We declare a job to be fraudulent if the average probability is greater than 0.15.
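The averaging step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names and the four sample probabilities are made up, and only the 0.15 threshold comes from the text.

```python
from statistics import mean

# Threshold from the text: a job is declared fraudulent when the
# average probability across the models is greater than 0.15.
THRESHOLD = 0.15


def average_ensemble(prob_lists, threshold=THRESHOLD):
    """Average per-job fraud probabilities across models, then threshold.

    prob_lists holds one list of probabilities per model, all in the
    same job order. Returns (averaged probabilities, 0/1 predictions).
    """
    averaged = [mean(job_probs) for job_probs in zip(*prob_lists)]
    predictions = [1 if p > threshold else 0 for p in averaged]
    return averaged, predictions


def f1_score(y_true, y_pred):
    """F1 score for the positive (fraudulent = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up probabilities for four jobs (illustrative only, not project data).
bow_probs = [0.05, 0.40, 0.10, 0.90]
lstm_probs = [0.02, 0.10, 0.20, 0.80]
transformer_probs = [0.03, 0.05, 0.30, 0.70]
labels = [0, 0, 1, 1]

averaged, predictions = average_ensemble(
    [bow_probs, lstm_probs, transformer_probs]
)
print(predictions)  # [0, 1, 1, 1]
print(round(f1_score(labels, predictions), 2))
```

A threshold well below 0.5 favours recall: in the toy data, the second job is flagged even though only one of the three models gives it a high probability.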
We find that the ensemble's F1 score (89.13%) is similar to those of the individual models (88.64%, 88.12%, and 84.13%).
Stacked Ensemble
Next, we combine the individual models using a more sophisticated technique: stacking. Here we take the probabilities returned by the bag-of-words, LSTM, and Transformer models and pass them to logistic regression as “features.” Logistic regression returns the final fraudulent / not-fraudulent verdict.
Unfortunately, the F1 score does not improve; it drops to 84.80%.
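To make the stacking idea concrete, here is a small dependency-free sketch in which the three base-model probabilities form the feature vector of a logistic-regression meta-model. Everything in it is an assumption made for illustration: the data is invented, the helper names are hypothetical, and the hand-written gradient-descent fit stands in for whatever library implementation the project used (for example, scikit-learn's `LogisticRegression`).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y  # derivative of log loss w.r.t. z
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(features, weights, bias, threshold=0.5):
    """Return 1 (fraudulent) or 0 (not fraudulent) for each row."""
    return [
        1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) > threshold
        else 0
        for x in features
    ]


# Each row: [bag-of-words, LSTM, Transformer] probability for one job.
# Made-up values for illustration only.
train_features = [
    [0.05, 0.02, 0.03],
    [0.10, 0.15, 0.05],
    [0.20, 0.10, 0.12],
    [0.70, 0.80, 0.60],
    [0.90, 0.85, 0.95],
    [0.60, 0.75, 0.80],
]
train_labels = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic_regression(train_features, train_labels)
train_predictions = predict(train_features, weights, bias)

new_jobs = [[0.08, 0.05, 0.10], [0.85, 0.90, 0.75]]
new_predictions = predict(new_jobs, weights, bias)
print(new_predictions)
```

In practice, the meta-model is usually trained on base-model predictions for data that the base models did not see during their own training; otherwise it learns from over-confident probabilities.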
Persistent False Negatives
From our experiments so far, it appears difficult to improve the F1 score beyond approximately 88%. We wonder whether the underlying reason for this plateau is simply that some fraudulent job descriptions are crafted well enough to be indistinguishable from real ones.
We look at one such false negative reported by the averaging ensemble. The full entry from the original CSV file is copied below:
2445,
Urgent Requirement For The Position Technical Lead - Rhomobile Technical Mobility Lead Developer,
“US, CA, ”,
,
55-65,
,
“1. Technical Lead - Rhomobile Technical Mobility Lead Developer NOTES- In case if Rhomobile is difficult to find, Please find people with expertise in Android development with, DOJO, CSS Location: San Francisco/ San Ramon, CADuration: 1 year Required Competencies: Should have around 6-10 yrs of IT Experience- out of which minimum 2yrs as Tech Lead of Mobile ProjectsExperience in RhoMobile Platform for atleast 6 MonthsShould be experienced in Cross Platform Mobile apps.Proficiency in atleast one Native Platform – iOS or AndroidHands-on Experience essentialShould be able to lead a team and guide them in Technical challengesShould be able to estimate and plan for the various activitiesIntegration Experience with Backend Systems Essential. Out of this , SAP Integration Experience is desirableESRI ArcGIS Knowledge Essential Thanks and Regards J.SandeepTechnical RecruiterTekWissen LLC W: #PHONE_b464fe6050e48f0c36d00501265378e9581d5d65c73f8e39865543c69aaab557# Desk: #PHONE_46ed5da44d683bbb700c54f51cd225fc80203b64e1c674f7e6d9f826a0223f31# Ext 299 Fax #PHONE_c92713f1c7155e946cc5b07c854ce554f3c95f79f2cbd98500e3cfa2db4ba406# #EMAIL_5f904f421f564976259f740a5c3ee9868cab5264ce252cf686be79da28bffd45# 321 S Main street Suite #300 Ann Arbor MI – 48104 #URL_3b7a90492e728246f2b56db07b079519d22b68faf5d8bbbd4e11a827db01e3ea#”,
“Requirement / JD: Drive the mobile app development work.Gather requirements from the client’s product management teamSuggesting the right Architecture to be adoptedTechnical Design of the AppCo-ordinate with the offshore team and get the work done from offshoreShould be able to Architect, design, code and deploy the applications ”,
,
0,
0,
1,
Contract,
Mid-Senior level,
Bachelor’s Degree,
Information Technology and Services,
Information Technology,
1
The job description appears genuine enough, although it contains formatting errors such as missing separation between words (“CADuration:” instead of “CA. Duration:”). Such errors may have been introduced by the script that the dataset creators used to “scrape” the data from job websites.
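Entries like the one above can be located by comparing the true labels with the ensemble's predictions. The sketch below shows one way to do this; the function name, job IDs, labels, and predictions are all hypothetical.

```python
def find_false_negatives(job_ids, labels, predictions):
    """Return IDs of jobs labelled fraudulent (1) but predicted genuine (0)."""
    return [
        job_id
        for job_id, label, pred in zip(job_ids, labels, predictions)
        if label == 1 and pred == 0
    ]


# Hypothetical IDs, labels, and ensemble predictions for four jobs.
job_ids = [101, 102, 103, 104]
labels = [1, 0, 1, 1]
predictions = [1, 0, 0, 1]

false_negatives = find_false_negatives(job_ids, labels, predictions)
print(false_negatives)  # [103]
```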