Liwaiwai Liwaiwai

The Weird Ones: How To Handle Outliers In Your Data

  • September 3, 2019
  • admin

Outliers are the weird ones in a data set: their values lie far from the rest of the sample. They can badly distort your analysis, especially if you are using methods that are sensitive to outliers, such as the mean or least-squares regression.

Given this, many analysts are inclined to simply remove these observations. While this is convenient, it can end up yielding false claims.


How exactly do we deal with these troublemakers?

Understanding outliers

For starters, we have to identify why these values occur in the first place. Some candidates for omission are cases where the outliers are produced by human or measurement errors.

On the other hand, there are cases where these aberrations are just true observations from the data set.

Consider a data set containing the net worth of US citizens. The net worth of Bill Gates is not a measurement error; it is simply an observation of a real yet rare event.

Given this, once you’ve identified potential outliers, check whether they are mere errors which you can omit or correct.
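Before you can check a value, you have to flag it. This post does not prescribe a detection method, so the sketch below assumes one common choice, the 1.5 × IQR rule. The function name, the multiplier `k`, and the net worth figures are all made up for illustration. Flagging is only a screening step: it tells you which values to look at, not whether they are errors.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a conventional default, not a rule from this post.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    flagged = [v for v in values if v < lower or v > upper]
    return lower, upper, flagged


# Hypothetical net worths (in thousands of dollars); the last one is
# a real but rare observation, not a typo.
net_worth = [45, 52, 60, 61, 75, 80, 88, 95, 110, 120_000_000]

lower, upper, flagged = flag_outliers_iqr(net_worth)
print("fences:", lower, upper)
print("flagged for review:", flagged)
```

Each flagged value still needs a human decision: trace it back to its source, and omit or correct it only if it turns out to be an error.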

To keep or not to keep

Now, if the observation turns out to be unusual yet true, you have to assess whether retaining or omitting that data point will benefit your analysis.

There really is no quick fix for outliers. It heavily depends on the context of your analysis as well as the needs of the problem at hand. However, here are some measures you might want to consider:

  • Assess the importance of the outlier. Some outliers are produced by events that arise from peculiar conditions. For instance, a decline in a company’s stock value may be due to a controversy which we do not expect to happen regularly. In this case, omitting the outlier may be reasonable.

On the other hand, some extreme values are better left in the data set. For instance, an unusually high earthquake magnitude in time series data should be retained, since such an event could occur again. Retaining it also allows such a damaging event to be taken into account in the decisions that the analysis informs.

  • Consider data transformations. In some instances, a proper transformation negates the impact of the outlying value or reduces it to a negligible level. Trying some out might just do the trick.
  • Consider reporting both cases. If you are not sure whether omission or retention is the way to go, you may report both the case where the outliers are retained and the case where they are omitted. This way, you retain the insights from both. Doing this may also help you decide whether to omit or retain the outlying values.
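To make the transformation guideline concrete, here is a minimal sketch that assumes a base-10 log transform, which is only one of many possible transformations and works only for positive values. The data and the helper `mean_to_median_ratio` are hypothetical; the ratio is just a rough way to see how far one extreme value drags the mean away from the median.

```python
import math
import statistics


def mean_to_median_ratio(xs):
    """Rough indicator of how far extreme values pull the mean from the median."""
    return statistics.mean(xs) / statistics.median(xs)


# Hypothetical positive measurements with one extreme value.
values = [45, 52, 60, 61, 75, 80, 88, 95, 110, 120_000]

# A log transform compresses large values far more than small ones.
logged = [math.log10(v) for v in values]

ratio_raw = mean_to_median_ratio(values)
ratio_log = mean_to_median_ratio(logged)

print(f"raw scale: mean/median = {ratio_raw:.2f}")
print(f"log scale: mean/median = {ratio_log:.2f}")
```

On the log scale the extreme value is still the largest point, but it no longer dominates the summary. Keep in mind that any conclusions then apply to the transformed scale.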

These are some soft guidelines you can consider. Again, these are NOT strict rules that you should follow all the time. Dealing with outliers is highly context-dependent. Data analysis is not a straight road; it is an art.
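If you choose to report both cases, the comparison can be as simple as the sketch below. Everything in it is an assumption for illustration: the earthquake magnitudes are invented, and the cutoff of 7.0 stands in for whatever criterion your own context justifies.

```python
import statistics


def summarize(values):
    """Basic summary statistics for one version of the data."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def report_both(values, is_outlier):
    """Summarize the data with and without the points flagged by is_outlier."""
    kept = [v for v in values if not is_outlier(v)]
    return {
        "with_outliers": summarize(values),
        "without_outliers": summarize(kept),
    }


# Hypothetical earthquake magnitudes; 7.8 is a rare but real event.
magnitudes = [4.1, 4.3, 4.0, 4.4, 4.2, 4.5, 4.1, 7.8]

# The 7.0 cutoff is an illustrative assumption, not a recommendation.
report = report_both(magnitudes, is_outlier=lambda m: m > 7.0)

for label, stats in report.items():
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Putting the two summaries side by side shows how much the conclusions depend on a single point, which is often the most useful thing to tell your readers.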

Weird is NOT wrong

To wrap things up, we see that outliers provide helpful insights that typical values may not provide. Therefore, we should not see these extremely different values as a nuisance.

Instead, we should examine why these values occur. Doing this will also give us the best way to deal with outliers. Let the data speak to you.

Removing the weirdos is not always the way to go. Trying to understand them might help you out more than you think.
