Bridge the Gap Between Data and Humanity With the Power of This Python Library

Author: Murphy | View: 29206 | Time: 2025-03-23 19:42:19

Introduction

We don't need statistics to realize that Python is one of the most widely used programming languages among software developers, data scientists, and others. This is due not only to its flexibility and ease of use, but also to the wide range of libraries that make our day-to-day tasks easier.

This article introduces another powerful library: humanize. It helps bridge the gap between humans and Python outputs by making those outputs more understandable. Let's have a look at some illustrations.

Getting started

To use humanize, the first step is to install it with the Python package manager pip as follows:

!pip3 install humanize

Next, import the following, which are needed to follow the tutorial:

getsize() from os.path, for getting the size of a given file.
datetime, for working with dates and times.
And finally, the humanize library, which is the center of this article.

from os.path import getsize
import datetime as dt
import humanize as h

Everything is now set up to start the exploration, beginning with big numbers.

Make big numbers more readable

What is this number: 1034503576643?

It takes some mental effort to tell whether this number is in the billions or the trillions. This is the type of burden that humanize tries to ease by providing the user with a nicer output.

One way to do this is to group the digits with comma separators, using the intcomma function as follows:

big_num = 1034503576643

human_big_num_comma = h.intcomma(big_num)

print(human_big_num_comma)

The output of the code above is 1,034,503,576,643, which is far easier to read than the original number without separators.

Furthermore, the result can be generated in natural language format using the intword function as follows:

human_big_num = h.intword(big_num)

print(human_big_num)

This gives the following result: 1.0 trillion.
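To get a feel for what these two functions produce, the sketch below reproduces both outputs with the standard library only. It is not how humanize is implemented: the helper name to_words and its short list of scale names are hypothetical, chosen only to cover this example.

```python
big_num = 1034503576643

# Thousands separators are available through Python's format mini-language.
with_commas = f"{big_num:,}"
print(with_commas)  # 1,034,503,576,643

# Hypothetical subset of scale names, largest first (an assumption for this sketch).
SCALES = [(10**12, "trillion"), (10**9, "billion"), (10**6, "million")]


def to_words(number: int) -> str:
    """Express a large integer as '<value> <scale>' with one decimal place."""
    for factor, name in SCALES:
        if number >= factor:
            return f"{number / factor:.1f} {name}"
    # Numbers below one million are returned unchanged.
    return str(number)


print(to_words(big_num))  # 1.0 trillion
```

The library remains the more convenient choice in practice, since it handles many more scales than this sketch does.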

Working with DateTime

The date 2022/9/6 (YYYY/MM/DD format) can also be written as Sep 06 2022.

The second format (Sep 06 2022) is much easier for anyone to understand than the first YYYY/MM/DD format because it is closer to the way we talk about dates in everyday conversation. Such a result can be obtained using the naturaldate function.

date = dt.date(2022, 9, 6)

human_date = h.naturaldate(date)

print(human_date)

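This generates the following result: Sep 06 2022. Note that naturaldate appends the year only when the date is more than about five months away from today; for closer dates it behaves like naturalday.

Instead of using naturaldate, the result can be limited to the month and the day using the naturalday function.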

human_day = h.naturalday(date)

print(human_day)

The result is Sep 06.
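The two outputs above correspond to the standard strftime codes %b, %d and %Y. The sketch below shows the same formatting with the datetime module alone, for comparison. The variable names long_form and short_form are introduced here for illustration, and unlike humanize this version never returns relative words such as today or yesterday.

```python
import datetime as dt

date = dt.date(2022, 9, 6)

# %b = abbreviated month name, %d = zero-padded day, %Y = four-digit year.
# The month abbreviation assumes Python's default (English) locale.
long_form = date.strftime("%b %d %Y")
short_form = date.strftime("%b %d")

print(long_form)   # Sep 06 2022
print(short_form)  # Sep 06
```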

Working with duration

Similar to dates, durations can also be made human-readable by using the naturaltime function, as illustrated below.

# Get the current date and time
current_time = dt.datetime.now()

# Define a duration of 3 days, 23 hours and 40 minutes
few_days_before = dt.timedelta(days=3, hours=23, minutes=40)

# Subtract the duration to get a moment in the past
past_time = current_time - few_days_before
human_time = h.naturaltime(past_time)

print(human_time)

The previous code generates 3 days ago, which anyone can understand. Note that the extra 23 hours and 40 minutes are dropped rather than rounded up to 4 days.
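The result of 3 days ago for a gap of almost four days is easier to understand once we look at how a timedelta stores its value. The sketch below is a simplified stand-in, not humanize's code: days_ago is a hypothetical helper that only handles whole days, and its wording for a single day may differ from what the library prints.

```python
import datetime as dt


def days_ago(delta: dt.timedelta) -> str:
    """Hypothetical helper: describe a past duration in whole days."""
    # timedelta.days holds only the whole days; hours and minutes live
    # in timedelta.seconds and are ignored here, not rounded up.
    days = delta.days
    unit = "day" if days == 1 else "days"
    return f"{days} {unit} ago"


few_days_before = dt.timedelta(days=3, hours=23, minutes=40)

print(few_days_before.days)       # 3
print(days_ago(few_days_before))  # 3 days ago
```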

Get the size and unit of files

My file size is 278.

The most obvious question from this statement is:

What unit are you using? Bytes, kilobytes, megabytes, gigabytes, terabytes?

This mystery can be solved using the naturalsize function, in two steps:

First, get the size of the CSV file using the getsize function.
Then the naturalsize function is used to generate a more appropriate output.

file_size = getsize("./candidates.csv")

# Before Humanize
print(file_size)

# After Humanize
print(h.naturalsize(file_size))

The result before humanization is 278.
After humanization, we get 278 Bytes.
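The idea behind this kind of output is to divide the byte count by a fixed base until the value becomes small enough to read. Below is a minimal standard-library sketch of that idea, assuming decimal (base 1000) units; readable_size and its unit labels are hypothetical and are not taken from humanize's source.

```python
def readable_size(num_bytes: int) -> str:
    """Hypothetical helper: format a byte count with decimal (base 1000) units."""
    if num_bytes < 1000:
        return f"{num_bytes} Bytes"
    size = float(num_bytes)
    for unit in ("kB", "MB", "GB", "TB"):
        size /= 1000
        if size < 1000:
            return f"{size:.1f} {unit}"
    # Anything larger than the last unit is still reported in TB.
    return f"{size:.1f} TB"


print(readable_size(278))        # 278 Bytes
print(readable_size(3000))       # 3.0 kB
print(readable_size(1_500_000))  # 1.5 MB
```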

Scientific notation and fractions

Scientific notation can be more useful in some scenarios, for example when very large or very small values are best expressed as powers of ten. This can be achieved using the scientific function.

With the precision parameter, the user can specify the number of digits to keep after the decimal point. When not specified, the precision is 2.

Below is an illustration.

# Number to convert to scientific format
value = 2304355

# Without precision (defaults to 2)
scientific_notation = h.scientific(value)
print(scientific_notation)

# With precision
scientific_notation = h.scientific(value, precision=5)
print(scientific_notation)

The outputs are given in the same order as the print statements.

Using the default function: 2.30 x 10⁶
With the precision parameter: 2.30436 x 10⁶
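For comparison, Python's built-in e format code produces the same mantissa and exponent. The sketch below uses it to show how the precision value controls the digits after the decimal point. The helper to_scientific is hypothetical, and it writes the exponent as ^6 instead of a superscript.

```python
value = 2304355


def to_scientific(number: float, precision: int = 2) -> str:
    """Hypothetical helper: return '<mantissa> x 10^<exponent>'."""
    # The 'e' format yields something like '2.30e+06'; split it at the 'e'.
    mantissa, exponent = f"{number:.{precision}e}".split("e")
    return f"{mantissa} x 10^{int(exponent)}"


print(to_scientific(value))               # 2.30 x 10^6
print(to_scientific(value, precision=5))  # 2.30436 x 10^6
```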

What do you think is the fractional representation of 0.4646?

Save yourself from too much mathematical computation and just use the fractional function as follows:

float_value = 0.4646

# Get the fractional representation
fraction = h.fractional(float_value)

print(fraction)

The answer is 105/226. That's really cool, isn't it!
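The same answer can be reached with the standard fractions module, which helps explain where 105/226 comes from: it is the closest fraction to 0.4646 whose denominator stays small. In the sketch below, the cap of 1000 on the denominator is an assumption made for this example, not a documented setting of humanize.

```python
from fractions import Fraction

float_value = 0.4646

# Find the closest fraction whose denominator does not exceed 1000
# (the cap is an assumption chosen for this sketch).
fraction = Fraction(float_value).limit_denominator(1000)

print(fraction)         # 105/226
print(float(fraction))  # close to 0.4646, not exactly equal
```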

What if I am dealing with another language?

All the previous results are in English. The same could be achieved in other languages such as French, Russian, and more.

The first step is to activate the internationalization (i18n) feature using the i18n.activate function.

For instance, a time delta of 3 seconds can be rendered in natural language, but this time in French.

# Activate the French Language
_t = h.i18n.activate("fr")

# Generate the time delta and print it in natural language
print(h.naturaltime(dt.timedelta(seconds=3)))

The result is il y a 3 secondes, which means 3 seconds ago in English. Calling h.i18n.deactivate() switches the output back to English.

Conclusion

Thank you for reading!
