Data Science Projects on GitHub That You Must Add to Your Portfolio

There are several ways to improve your proficiency in data science. You can join a PGP in Data Science course, read self-help books, or sift through articles and online courses. It all depends on how you build conceptual knowledge while keeping up a routine of regular practice.

Combining the two will improve your chances of landing a promising career opportunity in data science. You can fast-track your growth by pursuing a PGP certification while practicing with, and building on, the open-source projects available on GitHub.

GitHub is a platform that brings together developers from around the world to create and share open-source software projects. It is primarily a code hosting platform for version control and collaboration.

Experienced programmers have published their open-source code on GitHub, where anyone can read it, learn from it, and build their own models on top of it. Apart from building projects, you can also learn about recent breakthroughs there. Furthermore, you can augment your learning by enrolling in a PGP in Data Science course, which will give you the platform you need to develop your portfolio.

To help build your data science portfolio, here are some GitHub projects to consider:

Natural Language Processing (NLP) Projects

NLP is booming with many breakthroughs. Once you start going through them, you will realize it’s hard to keep up with the pace of new frameworks. There are many projects for you to experiment with and gain experience.

  1. ALBERT
    It is a lite version of BERT. If you are not familiar with BERT, it is a framework developed by Google that transformed NLP almost overnight. The original BERT is massive in size and won’t run on a local machine unless you have GPUs lying around, which led to the creation of ALBERT. It is used for building language models that handle the same kinds of tasks with substantially fewer parameters.
  2. StringSifter
    It is a machine learning tool that automatically ranks strings by their relevance for malware analysis, which makes it one of the more fascinating data science projects. A malware program often contains strings that it needs for operations such as copying a file to a specific location or creating a registry key. StringSifter surfaces this information, which can help in building strong malware detection programs.
  3. PLMpapers
    It is a collection of papers on Pretrained Language Models, which let you take an existing model and experiment with it instead of training one from scratch. The PLMpapers repository lists over 60 papers, covering BERT, XLNet, ERNIE, and ULMFiT, among various others.
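
To make the string-ranking idea from item 2 more concrete, here is a minimal Python sketch. It is not the tool's actual method: the real project uses a trained machine learning model, whereas this example swaps in a hand-written keyword score. The function names, the `SUSPICIOUS_HINTS` list, and the sample bytes are all assumptions made up for illustration.

```python
import re

# Hypothetical keyword list, chosen only for this illustration.
# A real ranking tool would learn what matters from labeled data.
SUSPICIOUS_HINTS = ("http", "\\", ".exe", ".dll", "HKEY_", "cmd")


def extract_strings(blob: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len long (like Unix `strings`)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, blob)]


def score(text: str) -> int:
    """Count how many of the hypothetical hints appear in the string."""
    lowered = text.lower()
    return sum(1 for hint in SUSPICIOUS_HINTS if hint.lower() in lowered)


def rank_strings(strings: list) -> list:
    """Sort strings from most to least interesting; ties keep their original order."""
    return sorted(strings, key=score, reverse=True)


# Made-up bytes standing in for a binary file.
sample = (
    b"\x00\x01MZ\x90\x00padding!\x00\x02C:\\Windows\\evil.exe"
    b"\x00\xffHKEY_LOCAL_MACHINE\\Run\x00ab\x00"
)

found = extract_strings(sample)
ranked = rank_strings(found)

if __name__ == "__main__":
    for text in ranked:
        print(score(text), text)
```

The file path and the registry key rise to the top because they match the hints, while the filler string sinks to the bottom. Replacing `score` with a model trained on labeled strings is the step that turns this sketch into a machine learning approach.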

Computer Vision Projects

Have you ever worked with image or video data? Computer vision is an advanced field, and specialists in it are in high demand. If you already have some knowledge of computer vision, you can add a few of these GitHub projects to your portfolio.

  1. Tiler
    The ability to work with image data is much sought after in industry, which comes as no surprise. Images are being uploaded and published at an unprecedented rate, and the pace will only increase in the coming years. Tiler is a simple tool for creating an image out of many smaller images, or tiles. Because the tiles can come in all shapes and sizes, the possibilities are practically endless.
  2. DeepPrivacy
    In today’s digital world, privacy is in short supply: every form of online activity is recorded, stored, analysed, and used to offer customized ads and product suggestions. One major consequence of this lack of privacy is the misuse and manipulation of images. DeepPrivacy is a fully automatic anonymization technique for images.
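
The tile-based idea from item 1 can be sketched in a few lines of Python. This is a toy version under stated assumptions, not Tiler's own algorithm: images are plain nested lists of grayscale values (0 to 255), tiles are matched only by average brightness, and `build_mosaic` together with the sample data are hypothetical names invented for this example.

```python
def mean(block):
    """Average brightness of a 2D list of grayscale values."""
    values = [value for row in block for value in row]
    return sum(values) / len(values)


def build_mosaic(image, tiles, block=2):
    """For each block x block region, pick the tile whose average brightness is closest.

    Returns a grid of tile names rather than pixels, to keep the sketch short.
    """
    tile_means = {name: mean(pixels) for name, pixels in tiles.items()}
    mosaic = []
    for top in range(0, len(image), block):
        row = []
        for left in range(0, len(image[0]), block):
            region = [line[left:left + block] for line in image[top:top + block]]
            target = mean(region)
            best = min(tile_means, key=lambda name: abs(tile_means[name] - target))
            row.append(best)
        mosaic.append(row)
    return mosaic


# Made-up tiles and a made-up 4x4 target image.
tiles = {
    "dark": [[0, 10], [20, 10]],
    "mid": [[120, 130], [130, 120]],
    "light": [[250, 255], [240, 255]],
}
image = [
    [0, 5, 200, 255],
    [10, 0, 240, 230],
    [100, 140, 30, 20],
    [130, 120, 0, 40],
]

mosaic = build_mosaic(image, tiles)

if __name__ == "__main__":
    for row in mosaic:
        print(row)
```

A real mosaic tool would work on actual image files and compare colours and shapes rather than a single brightness number, but the core loop of splitting the target into regions and choosing the best-matching tile for each one has the same shape.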

These are some of the data science projects that are all the rage right now. While you learn about these projects or build on them, you should also be aware of other projects on GitHub, such as TubeMQ and DeepCTR.

By Sanya
