Relying on data science means investing in its foundation.

Marinka Voorhout
3 min read · Oct 4, 2020

Early adopters have used data science, nowadays captured under the umbrella term AI, for multiple purposes. It is used to make life easier, e.g. by avoiding traffic jams or making Google searches easier (for more examples see: The Manifest). That convenience can also have negative side effects (just google “Social Media Bubble”).

With the rapid increase in the availability and complexity of data, more and more organisations are looking for ways to process this large volume of data: for example to monetize data & AI, to improve operational excellence (e.g. through chatbots), or simply to cope with the current data tsunami. Whatever the goal, organisations put their trust in (a growing number of) data scientists to process data.

“… a data scientist should be able to oversee the complete data & AI environment, including being able to interpret potential effects.”

Data science is based on algorithms: in essence, nothing more than successive steps to achieve a defined purpose, which is why data science often refers to them as recipes. Algorithms deliver the added value and convenience that organisations (and often their customers) are looking for. Yet a good algorithm takes into account which data is needed (and which is available!), to what extent the data can be used, how the quality of the data is defined by the process it supports, whether bias, privacy and ethical guidelines are respected, and the provenance of the data.
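To make this list a little more tangible, here is a minimal Python sketch of a few such checks run before any modelling starts. Everything in it is an assumption made for illustration: the `DatasetInfo` class, the `readiness_issues` function, the field names and the 10% threshold are hypothetical and do not come from any particular library or organisation. It also does not cover bias, which needs more than a mechanical check.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetInfo:
    """Hypothetical description of a dataset handed to a data scientist."""
    rows: list                       # list of dicts, one per record
    source: str = ""                 # provenance: where the data came from
    sensitive_fields: set = field(default_factory=set)  # e.g. personal data


def readiness_issues(info, needed_fields, max_missing_rate=0.1):
    """Return a list of readable issues; an empty list means none were found."""
    issues = []
    needed = set(needed_fields)

    # Collect every field that appears in at least one record.
    available = set()
    for row in info.rows:
        available.update(row.keys())

    # 1. Which data is needed, and which is actually available?
    for name in sorted(needed - available):
        issues.append(f"needed field '{name}' is not available")

    # 2. Data quality: share of records where a needed field is empty.
    for name in sorted(needed & available):
        missing = sum(1 for row in info.rows if row.get(name) in (None, ""))
        rate = missing / len(info.rows)
        if rate > max_missing_rate:
            issues.append(f"field '{name}' is missing in {rate:.0%} of records")

    # 3. Privacy: flag needed fields that are marked as sensitive.
    for name in sorted(needed & info.sensitive_fields):
        issues.append(f"field '{name}' is sensitive; check privacy and ethical guidelines")

    # 4. Provenance: the origin of the data should be recorded.
    if not info.source:
        issues.append("provenance (source) is not recorded")

    return issues


# Illustrative records only; not real data.
rows = [
    {"age": 34, "hobby": "chess", "years_experience": 5},
    {"age": 41, "hobby": "", "years_experience": 12},
]
info = DatasetInfo(rows=rows, sensitive_fields={"age"})
issues = readiness_issues(info, ["years_experience", "hobby", "education"])
for issue in issues:
    print(issue)
```

A check like this does not replace understanding the process behind the data; it only makes the questions explicit so that they are asked before the algorithm is built.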

All of these elements require a good understanding of the processes, systems, extraction tools and manual alterations from which both the data and the algorithms originate. In other words, a data scientist should be able to oversee the complete data & AI environment, including being able to interpret potential effects.

Yet, because data science & AI is a relatively new domain, organisations recruit data scientists fresh from university. In practice this has led to newly graduated data scientists who are unpleasantly surprised that the work is not as straightforward as what they were used to at university. The extensive data preparation in particular is regarded as inferior work.

Surprised data scientists, however, are not the issue here. The fact that organisations (and their customers) rely on expertise that should have a more extensive foundation is worrisome, to say the least. Lacking this foundation will lead to unwanted results, e.g. the well-known case of automated rejections of candidates at a large multinational (see: blog Cranium), where very suitable candidates were rejected because they lacked the ‘correct’ hobbies on their resume.

As the data science domain matures, additional guidelines, standards and legislation are evolving. Every data scientist should be aware of these, but should also look beyond the mathematical algorithm and explore its surroundings, including how they affect data science and its objectives. Organisations should not simply rely on data science. If you’re responsible for data & AI within your organisation, be proactively involved, and make sure that data scientists evolve as much as the data science domain does.


Marinka Voorhout

Data strategy & data monetization director. Currently @Philips, formerly @KPMG, @NAVARA, @Capgemni.