As Data Scientists, we solve a wide range of problems every day. Some are tasks we have completed many times before; others are completely new to us, and we have to understand the business logic behind them before we start building models. The latter was the case when our client, a bank, approached us a couple of months ago with a problem they were tackling: whether to provide or deny loans to gamblers.
In this month's blog post, we are going to share a case study based on a project we did for one of our clients – a Slovak bank.
By now, most financial institutions have been familiar with data analysis for some time. One example use case is credit scoring. Money lenders, such as banks and credit card companies, have been using it for a couple of decades to evaluate the risk of lending money to consumers and to mitigate losses. However, the recent arrival of new technologies and the rise of machine learning and Data Science have brought many new opportunities.
Whether you work at a startup or a big corporation, your company shares one problem with every other company today. At some point, you need to develop your own analytic capabilities so you can leverage your data efficiently. While outsourcing and getting help from consultants can be a great way to get things off the ground, eventually it makes sense to have dedicated people in-house. And voilà, you are searching for your first Data Scientist.
In this two-part blog post, we will show you how to analyse time series and how to forecast future values using the Box-Jenkins methodology. As a test dataset, we have chosen the "Monthly production of Gas in Australia". This dataset is available from DataMarket for free. We have restricted the data to the period from 1970 to 1995.
In this article, we look at 'the built environment', which Wikipedia describes as "human-made surroundings that provide the base for human activity, containing everything from parks, public transport or hospitals to coffees, bars, and restaurants." One way to study it is to explore neighbourhoods through open data.
A 5% increase in customer retention produces more than a 25% increase in profit. It is cheaper to keep existing customers than to gain new ones. Over the years, we have gained a lot of experience with churn prediction in industries such as telecommunications, banking, and computer security. Today we want to share some of that experience with you.
It has been two years since we started Knoyd. We have helped quite a few companies use their data more efficiently and have learnt a lot in the process. It has also been two years since we published one of our very first blog posts, analysing tweets about Star Wars: The Force Awakens. And since the new installment hit theaters in December, we have decided to create a re-hash of that original blog post.
In this post, we will show you how to scrape reviews from an Amazon product page. This data can be used to create datasets for sentiment analysis or for other educational or research purposes. If you sell products on Amazon, it can even be useful to analyse the reviews to understand what customers like and dislike about your product. Let's dive in!