Data Leaders Brief

Lessons learned building natural language processing systems in health care

O'Reilly on Data

MARCH 7, 2019

Language understanding benefits from every part of the fast-improving ABC of software: AI (freely available deep learning libraries like PyText and language models like BERT), big data (Hadoop, Spark, and Spark NLP), and cloud (GPUs on demand and NLP-as-a-service from all the major cloud providers).

Deep Learning

Deep Learning, Testing, Machine Learning, Modeling

What is data governance? Best practices for managing data assets

CIO Business Intelligence

MARCH 24, 2023

Data governance definition: Data governance is a system for defining who within an organization has authority and control over data assets and how those data assets may be used. It encompasses the people, processes, and technologies required to manage and protect data assets.

Data Governance

Data Governance, Management, Metadata, Data Quality

Addressing Irreproducibility in the Wild

Domino Data Lab

MAY 1, 2019

This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s “The Ingredients of a Reproducible Machine Learning Model” talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics as well as an Adjunct Lecturer at Northwestern University.

Machine Learning

Machine Learning, Testing, Data Science, Modeling

Manual Feature Engineering

Domino Data Lab

AUGUST 20, 2019

Many thanks to AWP Pearson for the permission to excerpt “Manual Feature Engineering: Manipulating Data for Fun and Profit” from the book, Machine Learning with Python for Everyone by Mark E. Feature engineering is useful for data scientists when assessing tradeoff decisions regarding the impact of their ML models.

Testing

Testing, Modeling, Interactive, Measurement

Leveraging user-generated social media content with text-mining examples

IBM Big Data Hub

AUGUST 28, 2023

With nearly 5 billion users worldwide (more than 60% of the global population), social media platforms have become a vast source of data that businesses can leverage for improved customer satisfaction, better marketing strategies and faster overall business growth. What is text mining? How does text mining work?

Data mining

Data mining, Machine Learning, Deep Learning, Marketing

What is an open data lakehouse and why you should care?

IBM Big Data Hub

JANUARY 17, 2023

A data lakehouse is an emerging data management architecture that converges data warehouse and data lake capabilities, driven by a need to improve efficiency and obtain critical insights faster. Let’s start with why data lakehouses are becoming increasingly important.

Data Lake

Data Lake, Metadata, Data Warehouse, Data Governance

How to Easily Understand Your Python Objects

Insight

JULY 23, 2019

I frequently run into this issue in my data science workflow with complex objects in libraries, like TensorFlow. kwonlydefaults is a dictionary with keyword-only arg default values. annotations is a dictionary specifying any type annotations. args contains the argument names. kwonlyargs lists names of keyword-only args.
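The field names in this excerpt match those returned by Python's standard-library `inspect.getfullargspec`, so a minimal sketch of that call may help. The `scale` function below is hypothetical and exists only to give the inspection something to report on; the excerpt does not say which function the original article inspected.

```python
import inspect

# Hypothetical function used only to demonstrate the inspection fields.
def scale(values, factor: float = 2.0, *, clip: bool = False, limit: int = 10) -> list:
    """Multiply each value by factor, optionally capping results at limit."""
    scaled = [v * factor for v in values]
    return [min(v, limit) for v in scaled] if clip else scaled

spec = inspect.getfullargspec(scale)

print(spec.args)            # names of the positional-or-keyword arguments
print(spec.kwonlyargs)      # names of the keyword-only arguments
print(spec.kwonlydefaults)  # dict of defaults for the keyword-only arguments
print(spec.annotations)     # dict of type annotations, including 'return'
```

The same call works on functions from third-party libraries, although callables implemented in C may not expose a signature that `inspect` can read.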

Data Science

Data Science, Testing, IT, Machine Learning

Building a Named Entity Recognition model using a BiLSTM-CRF network

Domino Data Lab

JULY 1, 2021

The model achieves relatively high accuracy and all data and code are freely available in the article. The process of statistical learning can automatically extract said rules from a training dataset. Data exploration and preparation. Now let’s calculate some statistics about the data. mentioned in unstructured text.
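As a rough illustration of the "calculate some statistics about the data" step, the sketch below counts tags and sentence lengths in a tiny tagged corpus. The corpus, the BIO tagging scheme, and the entity labels are all assumptions made for this example; the excerpt does not specify the dataset or tag set used in the article.

```python
from collections import Counter

# Hypothetical toy corpus: each sentence is a list of (token, tag) pairs.
# The BIO scheme and the PER/LOC/ORG labels are assumptions for illustration.
sentences = [
    [("Alice", "B-PER"), ("visited", "O"), ("Paris", "B-LOC"), (".", "O")],
    [("Bob", "B-PER"), ("Smith", "I-PER"), ("works", "O"), ("at", "O"),
     ("Acme", "B-ORG"), (".", "O")],
]

# How often each tag occurs across the whole corpus.
tag_counts = Counter(tag for sentence in sentences for _, tag in sentence)

# Each entity mention starts with exactly one "B-" tag under the BIO scheme.
entity_count = sum(n for tag, n in tag_counts.items() if tag.startswith("B-"))

# Sentence length statistics and case-insensitive vocabulary size.
lengths = [len(sentence) for sentence in sentences]
avg_length = sum(lengths) / len(lengths)
vocab_size = len({token.lower() for sentence in sentences for token, _ in sentence})

print(tag_counts.most_common())
print(entity_count, avg_length, vocab_size)
```

Counts like these are typically checked before training to spot heavy class imbalance, since the "O" tag usually dominates NER datasets.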

Modeling

Modeling, Statistics, Testing, Metrics

Towards Predictive Accuracy: Tuning Hyperparameters and Pipelines

Domino Data Lab

AUGUST 26, 2019

Data scientists, machine learning (ML) researchers, and business stakeholders have a high-stakes investment in the predictive accuracy of models. Data scientists and researchers ascertain predictive accuracy of models using different techniques, methodologies, and settings, including model parameters and hyperparameters.
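To make the distinction between model parameters and hyperparameters concrete, here is a minimal sketch of a grid search, assuming a one-variable ridge regression without an intercept. The data values, the function names, and the candidate grid are made up for illustration and are not taken from the article. The slope `w` is a parameter learned from the training data; the penalty `alpha` is a hyperparameter chosen by comparing error on held-out validation data.

```python
# Toy data where y is roughly 3 * x. All values are made up for illustration.
train = [(1.0, 3.1), (2.0, 5.9), (3.0, 9.2), (4.0, 11.8)]
valid = [(5.0, 15.1), (6.0, 17.8)]

def fit_slope(data, alpha):
    """Closed-form ridge fit of y = w * x; returns the learned parameter w."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + alpha)

def mse(data, w):
    """Mean squared error of the prediction w * x on the given data."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)

def grid_search(train_data, valid_data, alphas):
    """Fit once per candidate alpha and keep the lowest validation error."""
    results = []
    for alpha in alphas:
        w = fit_slope(train_data, alpha)
        results.append((mse(valid_data, w), alpha, w))
    return min(results)

candidate_alphas = [0.0, 0.1, 1.0, 10.0]
best_error, best_alpha, best_w = grid_search(train, valid, candidate_alphas)
print(best_alpha, best_w, best_error)
```

Scoring on a separate validation split, rather than on the training data, is what keeps the chosen hyperparameter from simply rewarding the closest fit to the training points.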

Testing

Testing, Modeling, Machine Learning, Metrics

Themes and Conferences per Pacoid, Episode 13

Domino Data Lab

OCTOBER 9, 2019

Paco Nathan’s latest article covers data practices from the National Oceanic and Atmospheric Administration (NOAA) Environment Data Management (EDM) workshop as well as updates from the AI Conference. Data Science meets Climate Science. Data veracity, data stewardship, and heroes of data science.

Deep Learning

Deep Learning, Metadata, Machine Learning, Data Science
