2021

DATA 300: Introduction to Analytics

LinkedIn

Over the past twelve weeks, I enjoyed teaching DATA 300, Introduction to Analytics, at Franklin University. The students came from very diverse backgrounds: Computer Science, Information Technology, Information Systems, Accounting, Business Administration, etc. This is not surprising, as analytics is now part of many roles across industries. We discussed the fundamental analytics topics, including #datawrangling, Descriptive Analytics, #datavisualization, Statistical Inference, #regressionanalysis, Simulation, and Optimization. Most students showed strong motivation to learn, and many of them did a great job! IMO, prompt out-of-class feedback is critical to productive online learning. I am glad that we chose McGraw Hill’s smart book for this redesigned course; it helps students understand the key concepts through reading questions and immediate feedback. Check it out if you are curious how the smart book works: https://lnkd.in/ems7qGHf

Data Science in 2021

LinkedIn

The latest episode of “Talk Python to me” (https://lnkd.in/eKRpC-SC) is great. Stan Seibert from Anaconda talked about what’s happening in the data science space in 2021. Some interesting takeaways from the 2021 State of Data Science survey (https://lnkd.in/eKgEJtBV) are: 1. 50% of respondents said data science investment stayed the same or increased, meaning data roles remained significant throughout the pandemic. 2. Data roles sit everywhere within the organization. Most of the respondents’ roles fall under IT (23%) and Research & Development (16%). 3. 87% of respondents said their organizations allow the use of open-source software. 4. The top 3 analytics programming languages are Python, SQL, and R.

Ed Discussion Tool

LinkedIn

Not surprisingly, I found students who asked more questions often had better grades. Asking questions works because it makes people active learners instead of passive recipients of knowledge. When people interact with information by elaborating on it and thinking critically about its context, they learn more productively.

Recently, I started using Ed Discussion in analytics courses at Franklin University. It is a fantastic tool that helps students ask questions more efficiently. Students can even ask anonymously or embed runnable code snippets. We did receive more questions :-). Check this quick start guide if you are interested in it: https://lnkd.in/eFVaYXnf

Correlation vs. causation

LinkedIn

Machine Learning/Data Mining is great at finding correlations in data, but not causation! How can we make sure that X causes Y, then? (1) X must occur before Y. (2) There must be evidence of an association between X and Y. (3) Other causal factors (confounders) must be controlled.

These are the basics we just discussed in the #businessanalytics course at Franklin University. The thing is, experimentation is the only way to confidently assert that one action causes another.

Of the three requirements above for a causal statement, (3) is the most challenging one. For example, it is hard to rule out confounders when people claim that a driver’s poor credit report causes a higher driving risk. Here are more interesting discussions about this hot real-world industry case, FYI: https://lnkd.in/etzYZX5n
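
To see why the confounder requirement matters, here is a minimal simulation sketch. Everything in it is made up for illustration: `z` is a hypothetical confounder that drives both `x` and `y`, while `x` has no effect on `y` at all. The raw correlation between `x` and `y` still looks strong, and it only disappears once `z` is controlled for (here with the standard first-order partial correlation formula).

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical data: z is a confounder that drives both x and y.
# x does NOT cause y anywhere in this simulation.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)

# Partial correlation of x and y after controlling for z.
partial_xy = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

print(f"raw correlation(x, y)      = {r_xy:.3f}")
print(f"partial correlation given z = {partial_xy:.3f}")
```

In real projects the confounders are usually not all observed, which is exactly why a controlled experiment is the safer route.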

Product Analyst Role

LinkedIn

Product Analyst is an in-demand #businessanalytics role. When I was in the insurance industry, I saw how product analysts greatly improved insurance products using their business acumen and analytics skills. If you are curious about what a product analyst does, this interesting article from my morning reading explains what the role is and what skills are essential to it, with some resources for learning them: https://lnkd.in/eWbQP-3Z #personalgrowth #professionaldevelopment #education #analytics #FranklinMakesItPossible Franklin University

A/B Testing

LinkedIn

Just discussed A/B Testing with students last night in a Marketing Analytics course at Franklin University. Analytics professionals often need to compare data. Instead of eyeballing, a statistical method can tell us whether the difference is significant or not, which is way more reliable and professional. Here is an article that walks through the statistical significance when comparing data in A/B testing: https://lnkd.in/er5R9tH

In case Medium’s paywall blocks you, post the above link on Twitter, then the full article will be available.
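
For the significance check mentioned in this post, a minimal sketch is a two-proportion z-test on conversion rates. This is one common choice and not necessarily the method the linked article uses; the function name and the visitor and conversion counts below are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 200 of 5000 visitors convert on A, 260 of 5000 on B.
z, p_value = two_proportion_z_test(200, 5000, 260, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is statistically significant at the 5% level.")
else:
    print("The difference could plausibly be due to chance.")
```

The normal approximation behind this test assumes reasonably large samples; for small counts an exact test would be a better fit.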

BUSA 603: Marketing Analytics

LinkedIn

I am teaching BUSA 603, Marketing Mgmt & Analytics, this summer. One thing that surprised me was that, although some students use SQL daily in their jobs, they were hearing about SQL’s window functions for the first time. Window functions (also known as analytic functions) are a kind of aggregation. However, unlike regular aggregate functions, a window function does not collapse rows into a single output row; instead, it performs a calculation across a set of rows that are related to the current row. Thus, it is very convenient for computing analytics metrics such as running totals, moving averages, row-to-row differences, etc. Please refer to the following great tutorial for more details: https://lnkd.in/erHUSHx

By the way, if Medium’s paywall blocks you, send the above link on Twitter to others or yourself, then the full article will be available via the link.
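
To make the window function idea concrete, here is a small sketch that runs in Python’s built-in `sqlite3` module. It assumes the bundled SQLite is version 3.25 or newer (where window functions were added), and the `daily_sales` table and its values are hypothetical. Note that every input row is kept in the output, each with a running total, a two-day moving average, and the difference from the previous row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of daily sales amounts.
conn.execute("CREATE TABLE daily_sales (day INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [(1, 100.0), (2, 150.0), (3, 120.0), (4, 180.0)],
)

query = """
SELECT
    day,
    amount,
    -- running total up to and including the current row
    SUM(amount) OVER (ORDER BY day) AS running_total,
    -- moving average over the previous row and the current row
    AVG(amount) OVER (
        ORDER BY day ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    ) AS moving_avg_2d,
    -- difference from the previous row (NULL for the first row)
    amount - LAG(amount) OVER (ORDER BY day) AS diff_from_prev
FROM daily_sales
ORDER BY day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A regular `GROUP BY` aggregate over the same table would return one row per group instead of one row per day.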

Free courses in LinkedIn Learning

LinkedIn

Do you know you can access thousands of amazing courses in LinkedIn Learning for free? Here is how it works using the Columbus Metropolitan Library’s account (contact your local Public Library if you are not living in Columbus, Ohio): 1. Visit https://lnkd.in/eqUaMTi 2. Click “Get started” and log in using your library card number and password. Done!

There are so many high-quality interactive courses. If you want to learn or sharpen your skills in #businessanalysis, you may start with Statistics Foundations, SQL for Statistics Essential Training, R Statistics Essential Training, and Python Statistics Essential Training.

Have fun!

Redesign DATA 610

LinkedIn

If you are an adult learner, you know that balancing work, study, and life/family is challenging. How, then, can we make learning more productive? Because most adult learners already have work experience, as an educator, I think the most important thing is to build relevance between that experience and the learning. For example, during course design, I ask myself how the course will give working adults the skill sets they need to improve their work performance. In the past two months, to make the course more relevant, together with our great instructional designer (Jesse Fuhrman) and course editor (Marivic Lesho), I redesigned DATA 610, a core course in Franklin University’s M.S. in Business Analytics program, using the textbook, Data Mining of Business Analytics (https://lnkd.in/eN7_TzQ). I added many real-world business analytics scenarios, so students will get more practice solving real-world business problems with data. I am excited that the redesigned DATA 610 will start next week, and I hope our adult learners enjoy it!

AWS Cloud

LinkedIn

Want to learn what the AWS Cloud is? Join the AWSome Day Online (https://lnkd.in/euFQNaC), a free 3-hour virtual session on Mar. 31 that will introduce AWS services and resources, and how customers can benefit through innovation on AWS. You’ll learn about AWS Cloud concepts, core AWS services, and how these services enable innovation for a variety of new applications and industries. No prior AWS or cloud knowledge is needed.

DataFest

LinkedIn

Although many data science competitions are available nowadays, ASA DataFest, backed by The American Statistical Association (ASA), can benefit students more effectively because:

  1. It is an excellent opportunity to learn. Undergraduate students do the work, but they are mentored by consultants who are graduate students, faculty, and experienced industry professionals.

  2. Its topics are well selected and very practical. In the past few years, DataFest covered how to explore the impact of COVID-19, how to recommend a college major based on Indeed data, and how visitors’ searches relate to whether they book a hotel, based on Expedia data.

  3. It helps students stand out in the competitive job market. Many professionals find that ASA DataFest is a great recruiting opportunity.

Franklin University will host a two-day ASA DataFest in the coming April. Here is how to register: https://lnkd.in/eSPKg8i

Notion

LinkedIn

When I first heard about Notion, I thought it was another note-taking software like EndNote or OneNote. But soon, I found that Notion is quite different. It is more like a Lego set for web apps. You can use it to build a webpage for a blog, wiki, portfolio, project, class, etc., without any programming. Many powerful tools have a steep learning curve, while Notion turns out to be very easy to learn and use. User experience is the top priority of this fast-growing startup in San Francisco, partially because its CEO and co-founder was an industrial designer.

Notion is very friendly for education. Its Pro version, which has unlimited file upload, is free for students and educators.

A digital portfolio helps students get a job. The following video shows how to build and share a portfolio using Notion in 15 min: https://lnkd.in/eyehgwJ

Domain Knowledge

LinkedIn

Besides math and programming, domain knowledge is the most critical skill set in the long term for analytics professionals. Want to get a feel for domain knowledge? Check out the following article and answer the questions inside: https://lnkd.in/ea-Zqr7

By the way, if the paywall blocks you, post the link on Twitter, and you will see the full article.

SQL Data Duplication

LinkedIn

In analytics projects using SQL, data duplication is a common mistake that leads to misleading conclusions. When two tables are joined, the left-side rows will be duplicated if the join key on the right side is not unique. The following animation from Thomas Neitmann’s blog shows perfectly why that is: https://lnkd.in/eKiivZC
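
The duplication described above is easy to reproduce. Here is a minimal sketch using Python’s built-in `sqlite3` module; the `customers` and `addresses` tables are hypothetical. Because one customer has two address rows, the join returns more rows than the left table has, and any count or sum taken after the join would be inflated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, name TEXT);
CREATE TABLE addresses (customer_id INTEGER, city TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
-- Customer 1 has two addresses, so customer_id is not unique in this table.
INSERT INTO addresses VALUES (1, 'Columbus'), (1, 'Dublin'), (2, 'Westerville');
""")

left_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

joined = conn.execute("""
    SELECT c.customer_id, c.name, a.city
    FROM customers AS c
    LEFT JOIN addresses AS a ON a.customer_id = c.customer_id
    ORDER BY c.customer_id, a.city
""").fetchall()

# Quick sanity check before joining: which keys repeat on the right side?
dup_keys = conn.execute("""
    SELECT customer_id, COUNT(*) AS n_rows
    FROM addresses
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""").fetchall()

print(f"rows in customers: {left_count}, rows after join: {len(joined)}")
print(f"non-unique join keys in addresses: {dup_keys}")
```

Comparing row counts before and after a join, or checking key uniqueness as in the last query, is a cheap habit that catches this mistake early.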

RStudio::global 2021

LinkedIn

rstudio::global (the former rstudio::conf) is one of my favorite conferences in the analytics field. It is enjoyable, informative, and the R community is very friendly. Talks from rstudio::global 2021 are now available at https://lnkd.in/gPNmWpW with subtitles in English, Spanish, and Chinese. Enjoy it!

Analytics Reporting

LinkedIn

In a Business or Data Analytics project, reporting is often the essential last step. If you are confused by the many reporting tools out there, I would suggest R Markdown. It is easy to get started with, produces professional output, and supports both R and Python! Check it out here: https://lnkd.in/eYt--tT

Key Data Science Concepts for Interview

LinkedIn

A good quick review of some key Data Science concepts that are often asked about in interviews: https://lnkd.in/ezdmWKg

Dr. Vesselinov

LinkedIn

It is my great pleasure and honor to welcome Dr. Roumen Vesselinov to our analytics adjunct faculty team. He will teach DATA 610, a big data analytics course, this spring semester. Dr. Vesselinov holds a PhD in Statistics and Econometrics and has taught at several universities, including University of South Carolina, Queens College, NY, and The Johns Hopkins University. He has taught courses in Statistics, Econometrics, Macroeconomics, and Data Mining. Dr. Vesselinov has more than 40 peer-refereed publications and has participated in many research grant projects funded by the National Institutes of Health (NIH), Centers for Disease Control and Prevention (CDC), U.S. Department of Defense, Eurostat, MacArthur Foundation, etc. Dr. Vesselinov is a recipient of the Pamela W. Young Excellence Award from University of South Carolina, a Fulbright scholar at Brown University, and a SAS certified professional. He has more than 1600 citations on Google Scholar.

Why R is the right tool

LinkedIn

Why is R the right tool for Business Analytics professionals to learn? - It better supports business-ready analytics and reports. - It is easy to learn thanks to tidyverse, a set of packages with a consistent programming interface.

Why not Python, then? When I was in the industry, I felt R was more productive for exploring business problems, as it handles data manipulation, modeling, and reporting very well in one platform. Python, as a full-stack programming language, is well-suited for data scientists and engineers. I would suggest both, but do not learn them at the same time if you are new to programming.

See more details about why R is good for business here: https://lnkd.in/e_mBdwR

Essential SQL Concepts

LinkedIn

What is the most important programming skill in Business Analytics? Surprisingly, it is SQL instead of Python/R. SQL’s syntax is simple, but it can describe “what you want” from data intuitively and conveniently. Here is a great quick summary of SQL from the perspective of analytics and data science: https://lnkd.in/eEV2kfx

2020

Highlighted Analytics Skills

LinkedIn

LinkedIn’s 2020 Emerging Jobs report (https://lnkd.in/eUdiXcW) shows that the top emerging job is Artificial Intelligence Specialist, with Machine Learning and Python among the highlighted skills.

At Franklin University, I designed a Python-based Machine Learning course for undergraduate students two years ago (COMP 411, next online section starts on 1/25/2021), and students who took this course love it. The major topics include Python review, classification, regression, clustering, dimensionality reduction, and natural language processing.

Check it out if you are a CS/IT/Cyber student at Franklin University and are interested in the booming AI or Machine Learning Engineer roles.

AWS Big Data Event

LinkedIn

Big data is critical in Business Analytics. Amazon AWS is probably the most popular big data platform nowadays. This year, the AWS re:Invent event is free: https://lnkd.in/eqv4Qcp (thanks to my colleague Bilge Karabacak for letting me know). Some topics look very interesting: - Jupyter notebooks in Amazon SageMaker (Amazon’s online Jupyter notebook) - How Zynga modernized mobile analytics with Amazon Redshift RA3 (Amazon’s data warehouse) - How to choose the right instance type for ML inference (cloud computing for machine learning)

Confidence Interval

LinkedIn

When I discussed a research project with our graduate assistant yesterday, I suggested adding the confidence interval (CI) to the descriptive analytics. Understanding uncertainty is important in business analytics. Here is how to get an empirical CI in practice: 1. Bootstrap the data n times, that is, resample it with replacement (n depends on the data size, usually 200 to 300). 2. Calculate the metric for each resampled data set. 3. Identify the 90% CI of the metric from the 5th and 95th percentiles of those values. See the following post for a Python implementation: https://lnkd.in/g_hQRqZ
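
The three steps above can be sketched with the Python standard library alone. This is my own minimal illustration, not the implementation from the linked post; the function name `bootstrap_ci`, its parameters, and the sample `order_values` are all hypothetical, and the percentiles are taken with a simple nearest-rank lookup.

```python
import random
import statistics


def bootstrap_ci(data, metric=statistics.mean, n_resamples=300, level=0.90, seed=0):
    """Empirical percentile confidence interval for `metric` via bootstrap."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    # Steps 1 and 2: resample with replacement, compute the metric each time.
    estimates = sorted(
        metric(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    # Step 3: read off the lower and upper percentiles (5% and 95% for a 90% CI).
    alpha = (1 - level) / 2
    lower_idx = round(alpha * (n_resamples - 1))
    upper_idx = round((1 - alpha) * (n_resamples - 1))
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical sample: order values from a small study.
order_values = [23.5, 41.0, 18.2, 35.9, 52.3, 27.4, 30.1, 44.8, 21.7, 39.6,
                26.3, 33.0, 48.5, 19.9, 37.2, 29.8, 42.1, 24.6, 31.5, 36.4]

lower, upper = bootstrap_ci(order_values)
print(f"mean = {statistics.mean(order_values):.2f}")
print(f"90% bootstrap CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Passing `metric=statistics.median` gives an interval for the median instead, which is handy when the metric has no simple closed-form standard error.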

Suggestion to students: get hands dirty

LinkedIn

Besides the two suggestions I gave students, “continue being curious” and “be a problem solver”, the last one is to “get hands dirty”. Business Analytics is about getting insights from historical data to drive business decisions. Thus, understanding the data is critical, and there is no way to get data ready without getting your hands dirty. To this end, I would suggest broad exposure to data wrangling tools, including SQL, R, and Python, if possible. If I could only pick one, it would be SQL. Many new analytics professionals are surprised to find that SQL is far more important than they thought. I used SQL very often when I was a data scientist in the industry. Matt Bonakdarpour, the data science VP of Root Inc., often said, “SQL is the cash you pay to get what you want quickly.” When I discussed the M.S. in Business Analytics program with our advisory board member, Kimberly Klenk, she also emphasized that SQL is an essential skill. If you are not confident about SQL, here is a good tutorial: https://lnkd.in/ex8r9XN. The cool thing is that you can practice SQL for business analytics and see the results using just a web browser.

Data Cleaning

LinkedIn

There’s the joke that 80 percent of data science is cleaning the data and 20 percent is complaining about cleaning the data 😃… The big takeaway here is that data preparation is critical in the industry. The following framework is a great guideline.

Suggestion to students: be a problem solver

LinkedIn

Besides “continue being curious,” my second suggestion to my students is “be a problem solver.” Instead of just monitoring and reporting data, business analytics professionals have more exposure to problem-solving. Thanks to fast-growing analytics tools, business analytics professionals may not need to know as many mathematical details as scientists or write code as complicated as engineers do. However, in my opinion, they do need to clearly understand the principles of data science to support analytical thinking. I highly recommend the book Data Science for Business (https://lnkd.in/eGHeVkC), as it is easy to read, practical, and very friendly for working adult learners.

Suggestion to students: continue being curious

LinkedIn

After I finished teaching the course on big data analytics last week, what impressed me most was that the students in the Master of Science in Business Analytics, Data Analytics, and Health Informatics programs showed strong motivation and curiosity to learn. At Franklin University, most students are working adults. Learning is challenging because they have to balance work, life, and study. However, they know what they want and are eager to pick up new knowledge, which is fantastic! In the industry, good analytics professionals often ask questions and enjoy learning something new. Thus, my first post-class suggestion to my students is “to continue being curious,” which, from an educator and former data scientist’s perspective, will surely benefit their career development.

Data Science Thinking: a case study

LinkedIn

A very interesting case study about how to solve a real-world business analytics problem using the “Business-driven” philosophy. https://towardsdatascience.com/problem-solving-as-data-scientist-a-case-study-49296d8cd7b7