Learn R Programming

DSjobtracker

What skills and qualifications are required for data science related jobs?

Installation

You can install the development version from GitHub:

#install.packages("devtools")
devtools::install_github("thiyangt/DSjobtracker")
library(DSjobtracker)

Glimpse of tidy data

tibble::glimpse(DStidy)
Rows: 1,172
Columns: 109
$ ID                                 <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, …
$ Consultant                         <chr> "Thiyanga", "Jayani", "Jayani", "Ja…
$ DateRetrieved                      <dttm> 2020-08-05, 2020-08-07, 2020-08-07…
$ DatePublished                      <dttm> NA, 2020-07-31, 2020-08-06, 2020-0…
$ Job_title                          <chr> NA, "Junior Data Scientist", "Engin…
$ Company                            <chr> NA, "Dialog Axiata PLC", "London St…
$ R                                  <dbl> 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0,…
$ SAS                                <dbl> 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,…
$ SPSS                               <dbl> 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,…
$ Python                             <dbl> 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0,…
$ MAtlab                             <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Scala                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `C#`                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `MS Word`                          <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,…
$ `Ms Excel`                         <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1,…
$ `OLE/DB`                           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `Ms Access`                        <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,…
$ `Ms PowerPoint`                    <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0,…
$ Spreadsheets                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Data_visualization                 <dbl> 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ Presentation_Skills                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Communication                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ BigData                            <dbl> 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0,…
$ Data_warehouse                     <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ cloud_storage                      <dbl> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Google_Cloud                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ AWS                                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Machine_Learning                   <dbl> 0, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0,…
$ `Deep Learning`                    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Computer_vision                    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Java                               <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,…
$ `C++`                              <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ C                                  <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ `Linux/Unix`                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ SQL                                <dbl> 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0,…
$ NoSQL                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ RDBMS                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Oracle                             <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,…
$ MySQL                              <dbl> 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0,…
$ PHP                                <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ SPL                                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ web_design_and_development_tools   <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ AI                                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `Natural_Language_Processing(NLP)` <dbl> 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0,…
$ `Microsoft Power BI`               <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Google_Analytics                   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ graphics_and_design_skills         <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ Data_marketing                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ SEO                                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Content_Management                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Tableau                            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,…
$ D3                                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,…
$ Alteryx                            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ KNIME                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Spotfire                           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Spark                              <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,…
$ S3                                 <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ Redshift                           <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ DigitalOcean                       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ Javascript                         <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ Kafka                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Storm                              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Bash                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Hadoop                             <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…
$ Data_Pipelines                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ MPP_Platforms                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Qlik                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Pig                                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Hive                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…
$ Tensorflow                         <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `Map/Reduce`                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Impala                             <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Solr                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Teradata                           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ MongoDB                            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Elasticsearch                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ YOLO                               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `agile execution`                  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,…
$ Data_management                    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ pyspark                            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Data_mining                        <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ Data_science                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,…
$ Web_Analytic_tools                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ IOT                                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Numerical_Analysis                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Economic                           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Finance_Knowledge                  <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Investment_Knowledge               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Problem_Solving                    <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Team_Handling                      <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Debtor_reconcilation               <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Payroll_management                 <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Bayesian                           <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Optimization                       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ Knowledge_in                       <chr> NA, NA, "Elasticsearch, Logstash, K…
$ City                               <chr> NA, "Colombo", "Colombo", "Colombo"…
$ Educational_qualifications         <chr> NA, "Degree in Engineering / IT or …
$ Salary                             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ URL                                <chr> NA, "https://www.google.com/search?…
$ Search_Term                        <chr> NA, "Data Analysis Jobs in Sri Lank…
$ Job_Category                       <chr> "Unimportant", "Data Science", "Dat…
$ Experience_Category                <chr> "More than 2 and less than 5 years"…
$ Location                           <chr> NA, "Sri Lanka", "Sri Lanka", "Sri …
$ `Payment Frequency`                <chr> "unspecified", "unspecified", "unsp…
$ BSc_needed                         <dbl> 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1,…
$ MSc_needed                         <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,…
$ PhD_needed                         <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ `English Needed`                   <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,…
$ year                               <dbl> 2020, 2020, 2020, 2020, 2020, 2020,…

2021 survey data

df2021 <- get_data(2021)
df2021
# A tibble: 382 × 115
      ID Consultant URL        Search_Term DateRetrieved DatePublished Job_Field
   <dbl> <chr>      <chr>      <chr>       <chr>         <chr>         <chr>    
 1     1 Abhishanya https://w… statistics… 11/10/2021    11/10/2021    Informat…
 2     2 Abhishanya https://w… statistics… 11/11/2021    11/11/2021    Education
 3     3 Abhishanya https://w… Data scien… 11/11/2021    11/9/2021     Other    
 4     4 Abhishanya https://w… Data scien… 11/11/2021    10/13/2021    Other    
 5     5 Abhishanya https://w… Statistics… 11/11/2021    10/14/2021    Informat…
 6     6 Abhishanya https://w… Data Scien… 11/11/2021    10/14/2020    Health   
 7     7 Abhishanya https://c… Data engin… 11/11/2021    11/8/2021     Other    
 8     8 Abhishanya https://l… Data engin… 11/11/2021    11/9/2021     Other    
 9     9 Abhishanya https://w… Data engin… 11/11/2021    10/20/2021    Informat…
10    10 Abhishanya https://w… Data engin… 11/11/2021    10/21/2021    Finance  
# ℹ 372 more rows
# ℹ 108 more variables: Job_title <chr>, Company <chr>, Knowledge_in <chr>,
#   `Minimum Experience in Years` <dbl>, City <chr>, Location <chr>,
#   Educational_qualifications <chr>, `Payment Frequency` <chr>,
#   Currency <chr>, Salary <dbl>, `English Needed` <dbl>,
#   `English proficiency description` <chr>, Additional_languages <chr>,
#   AI <dbl>, `Natural_Language_Processing(NLP)` <dbl>, Data_Pipelines <dbl>, …

Preview of the tidy version of the dataset

head(DStidy)
# A tibble: 6 × 109
     ID Consultant DateRetrieved       DatePublished       Job_title     Company
  <dbl> <chr>      <dttm>              <dttm>              <chr>         <chr>  
1     1 Thiyanga   2020-08-05 00:00:00 NA                  <NA>          <NA>   
2     2 Jayani     2020-08-07 00:00:00 2020-07-31 00:00:00 Junior Data … Dialog…
3     3 Jayani     2020-08-07 00:00:00 2020-08-06 00:00:00 Engineer, An… London…
4     4 Jayani     2020-08-07 00:00:00 2020-07-24 00:00:00 CI-Statistic… E.D. B…
5     5 Jayani     2020-08-07 00:00:00 2020-07-24 00:00:00 DA-Data Anal… E.D. B…
6     6 Jayani     2020-08-07 00:00:00 2020-08-13 00:00:00 Data Scienti… Emirat…
# ℹ 103 more variables: R <dbl>, SAS <dbl>, SPSS <dbl>, Python <dbl>,
#   MAtlab <dbl>, Scala <dbl>, `C#` <dbl>, `MS Word` <dbl>, `Ms Excel` <dbl>,
#   `OLE/DB` <dbl>, `Ms Access` <dbl>, `Ms PowerPoint` <dbl>,
#   Spreadsheets <dbl>, Data_visualization <dbl>, Presentation_Skills <dbl>,
#   Communication <dbl>, BigData <dbl>, Data_warehouse <dbl>,
#   cloud_storage <dbl>, Google_Cloud <dbl>, AWS <dbl>, Machine_Learning <dbl>,
#   `Deep Learning` <dbl>, Computer_vision <dbl>, Java <dbl>, `C++` <dbl>, …

Preview of the untidy version of the dataset

head(DSraw)
# A tibble: 6 × 152
     ID Consultant DateRetrieved DatePublished Job_title     Company     R   SAS
  <dbl> <chr>      <chr>         <chr>         <chr>         <chr>   <dbl> <dbl>
1     1 Thiyanga   05/08/2020    <NA>          <NA>          <NA>        1     1
2     2 Jayani     07/08/2020    31/07/2020    Junior Data … Dialog…     1     0
3     3 Jayani     07/08/2020    06/08/20      Engineer, An… London…     0     0
4     4 Jayani     07/08/2020    24/07/2020    CI-Statistic… E.D. B…     1     1
5     5 Jayani     07/08/2020    24/07/2020    DA-Data Anal… E.D. B…     0     1
6     6 Jayani     07/08/2020    13/08/2020    Data Scienti… Emirat…     1     0
# ℹ 144 more variables: SPSS <dbl>, Python <dbl>, MAtlab <dbl>, Scala <dbl>,
#   `C#` <dbl>, `MS Word` <dbl>, `Ms Excel` <dbl>, `OLE/DB` <dbl>,
#   `Ms Access` <dbl>, `Ms PowerPoint` <dbl>, Spreadsheets <dbl>,
#   Data_visualization <dbl>, Presentation_Skills <dbl>, Communication <dbl>,
#   BigData <dbl>, Data_warehouse <dbl>, cloud_storage <dbl>,
#   Google_Cloud <dbl>, AWS <dbl>, Machine_Learning <dbl>,
#   `Deep Learning` <dbl>, Computer_vision <dbl>, Java <dbl>, `C++` <dbl>, …

Copy Link

Version

Install

install.packages('DSjobtracker')

Monthly Downloads

210

Version

2.0.0

License

CC BY 4.0

Issues

Pull Requests

Stars

Forks

Maintainer

Thiyanga S. Talagala

Last Published

December 9th, 2023

Functions in DSjobtracker (2.0.0)

get_data

Get data from DSjobtracker for specific years or all the years combined into one dataset
DStidy_2020

Data scientists, data Analyst, and statistician related job advertisements in 2020
DStidy

Data scientists, data analyst, and statistician job advertisements from 2020 to 2023
DSraw

Data Scientists/Data Analyst/ Statistician Job Tracker
DStidy_2021

Data Scientists/Data Analyst/ Statistician Job Advertisements in the year 2021