Bridge the Gap Between Data and Humanity With the Power of This Python Library
Introduction
We don't need to rely on any statistics to realize that Python is one of the most used programming languages among software developers, data scientists, and others. This is due not only to its flexibility and ease of use, but also to the wide range of libraries that make our day-to-day tasks easier.
This article introduces another powerful library: humanize.
It helps bridge the gap between humans and Python outputs, by making them more understandable. Let's have a look at some illustrations.
Getting started
To use humanize, the first step is to install it with the Python package manager pip, as follows:
!pip3 install humanize
Next, import the libraries needed to follow the tutorial:
- getsize() from the os.path module, for getting the size of a given file.
- datetime, used to work with dates and times.
- And finally, the humanize library, which is the center of this article.
from os.path import getsize
import datetime as dt
import humanize as h
Everything is now set up to start the exploration, beginning with big numbers.
Make big numbers more readable
What is this number: 1034503576643?
It takes some mental effort to work out whether this number is in the billions or the trillions. This is the kind of burden that humanize tries to ease by giving the user a nicer output.
One way of doing so is to separate groups of three digits with commas (','), which is done using the intcomma function as follows:
big_num = 1034503576643
human_big_num_comma = h.intcomma(big_num)
print(human_big_num_comma)
The output of the code above is 1,034,503,576,643, which is far more readable than the original number without separators.
Furthermore, the result can be expressed in natural language using the intword function as follows:
human_big_num = h.intword(big_num)
print(human_big_num)
This gives the following result: 1.0 trillion.
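To make the two transformations concrete, here is a minimal standard-library sketch of the same ideas. It is not how humanize is implemented: the helper names group_thousands and rough_intword are made up for illustration, and the sketch only covers millions to trillions without handling rounding near unit boundaries.

```python
def group_thousands(n: int) -> str:
    """Insert a comma between each group of three digits."""
    return f"{n:,}"


def rough_intword(n: int) -> str:
    """Express a large integer as a rounded multiple of a named power of ten."""
    named_powers = [
        (10**12, "trillion"),
        (10**9, "billion"),
        (10**6, "million"),
    ]
    for power, name in named_powers:
        if abs(n) >= power:
            return f"{n / power:.1f} {name}"
    # Numbers below one million are returned unchanged
    return str(n)


big_num = 1034503576643
print(group_thousands(big_num))  # 1,034,503,576,643
print(rough_intword(big_num))    # 1.0 trillion
```

In practice the library versions are preferable, since they cover more named powers and edge cases than this sketch.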
Working with DateTime
The date 2022/9/6 (YYYY/MM/DD format) can also be written as Sep 06 2022.
The second format (Sep 06 2022) is much easier for anyone to understand than the first YYYY/MM/DD format because it is closer to how we speak in day-to-day communication. Such a result can be obtained using the naturaldate function.
date = dt.date(2022, 9, 6)
human_date = h.naturaldate(date)
print(human_date)
This generates the following result: Sep 06 2022.
Instead of using naturaldate, the result can be limited to the month and the day using the naturalday function.
human_day = h.naturalday(date)
print(human_day)
The result is Sep 06
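According to the humanize documentation, naturalday returns today, tomorrow or yesterday for those dates, and naturaldate appends the year only for dates more than about five months away from today, so the outputs above depend on when the code runs. The following sketch mimics that rule with the standard library only. The helper rough_naturaldate and its explicit today argument are assumptions for illustration, not part of humanize.

```python
import datetime as dt


def rough_naturaldate(value: dt.date, today: dt.date) -> str:
    """Format a date relative to an explicit reference day."""
    delta_days = (value - today).days
    relative_names = {0: "today", 1: "tomorrow", -1: "yesterday"}
    if delta_days in relative_names:
        return relative_names[delta_days]
    # Append the year only when the date is roughly five months or more away
    if abs(delta_days) >= 5 * 365 / 12:
        return value.strftime("%b %d %Y")
    return value.strftime("%b %d")


date = dt.date(2022, 9, 6)
print(rough_naturaldate(date, today=dt.date(2022, 9, 7)))   # yesterday
print(rough_naturaldate(date, today=dt.date(2022, 10, 1)))  # Sep 06
print(rough_naturaldate(date, today=dt.date(2023, 6, 1)))   # Sep 06 2022
```

Passing the reference day explicitly keeps the sketch reproducible, whereas the library compares against the actual current date.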
Working with duration
Similar to dates, durations can also be made human-readable using the naturaltime function, as illustrated below.
# Get the current date and time
current_time = dt.datetime.now()
# Define a duration of 3 days, 23 hours and 40 minutes
few_days_before = dt.timedelta(days=3, hours=23, minutes=40)
# Subtract the duration to get a moment in the past
past_time = current_time - few_days_before
human_time = h.naturaltime(past_time)
print(human_time)
The previous code generates 3 days ago, which anyone can understand.
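The idea behind this output is to keep only the largest meaningful unit of the elapsed time. Here is a minimal sketch of that idea, assuming a hypothetical helper named rough_time_ago that is far simpler than humanize (no months or years, and no wording such as "a day ago"):

```python
import datetime as dt


def rough_time_ago(delta: dt.timedelta) -> str:
    """Describe a positive duration using only its largest unit."""
    total_seconds = int(delta.total_seconds())
    units = [("day", 86400), ("hour", 3600), ("minute", 60), ("second", 1)]
    for name, seconds_per_unit in units:
        count = total_seconds // seconds_per_unit
        if count >= 1:
            plural = "" if count == 1 else "s"
            return f"{count} {name}{plural} ago"
    # Durations shorter than one second
    return "now"


print(rough_time_ago(dt.timedelta(days=3, hours=23, minutes=40)))  # 3 days ago
print(rough_time_ago(dt.timedelta(minutes=40)))                    # 40 minutes ago
```

Note that the remaining 23 hours and 40 minutes are dropped rather than rounded up, which matches the 3 days ago result shown above.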
Get the size and unit of files
My file size is 278.
The most obvious question raised by this statement is:
What unit are you using? Bytes, kilobytes, megabytes, gigabytes, terabytes?
This mystery can be solved using the naturalsize function as shown below:
- First, get the size of the CSV file using the getsize function.
- Then use the naturalsize function to generate a more appropriate output.
file_size = getsize("./candidates.csv")
# Before Humanize
print(file_size)
# After Humanize
print(h.naturalsize(file_size))
- The result before humanization is 278.
- After humanization, we get 278 Bytes.
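By default, naturalsize uses decimal units (1 kB = 1000 Bytes). The sketch below reproduces that unit selection with the standard library. The helper rough_naturalsize is hypothetical, and since candidates.csv is not available here, a temporary 278-byte file stands in for it.

```python
import os
import tempfile
from os.path import getsize


def rough_naturalsize(num_bytes: int) -> str:
    """Pick a decimal unit (1 kB = 1000 Bytes) for a size in bytes."""
    if num_bytes < 1000:
        return f"{num_bytes} Byte" + ("" if num_bytes == 1 else "s")
    units = ["kB", "MB", "GB", "TB"]
    size = num_bytes / 1000
    index = 0
    # Keep dividing by 1000 until the value fits the current unit
    while size >= 1000 and index < len(units) - 1:
        size /= 1000
        index += 1
    return f"{size:.1f} {units[index]}"


# Create a temporary 278-byte file to stand in for candidates.csv
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 278)
    tmp_path = tmp.name

file_size = getsize(tmp_path)
os.remove(tmp_path)

print(file_size)                     # 278
print(rough_naturalsize(file_size))  # 278 Bytes
print(rough_naturalsize(3_000))      # 3.0 kB
```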
Scientific notation and fractions
Scientific notation can be more useful in some scenarios, for example when expressing values as powers of ten. This can be achieved using the scientific function.
With the precision parameter, the user can specify the number of digits to keep after the decimal point. When not specified, the precision value is 2.
Below is an illustration.
# Number to convert to scientific format
value = 2304355
# Without precision
scientific_notation = h.scientific(value)
print(scientific_notation)
# With precision
scientific_notation = h.scientific(value, precision=5)
print(scientific_notation)
The outputs are given in the same order as the print statements.
- Using the default precision: 2.30 x 10⁶
- With the precision parameter: 2.30436 x 10⁶
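As a point of comparison, Python's built-in e format specifier takes a precision in the same sense (digits after the decimal point) and yields the same mantissa and exponent, only in a less polished notation. This is a standard-library illustration, not a description of humanize's internals.

```python
value = 2304355

# Two digits after the decimal point, like the default precision
default_notation = f"{value:.2e}"
print(default_notation)  # 2.30e+06

# Five digits after the decimal point
precise_notation = f"{value:.5e}"
print(precise_notation)  # 2.30436e+06
```

The scientific function adds the nicer rendering on top, with a multiplication sign and a superscript exponent.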
What do you think is the fractional representation of 0.4646?
Save yourself the mental arithmetic and just use the fractional function as follows:
float_value = 0.4646
# Get the fractional representation
fraction = h.fractional(float_value)
print(fraction)
The answer is 105/226. That's really cool, isn't it?
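The same answer can be reproduced with the standard library's fractions module. The sketch below assumes the denominator is capped at 1000, a limit chosen here because it yields the result above. It is an illustration rather than humanize's actual code.

```python
from fractions import Fraction

float_value = 0.4646

# Closest fraction whose denominator does not exceed 1000 (assumed cap)
approximation = Fraction(float_value).limit_denominator(1000)
print(approximation)  # 105/226

# Size of the approximation error introduced by the cap
print(abs(float(approximation) - float_value))
```

The fraction is an approximation: 0.4646 is exactly 2323/5000, and 105/226 is simply the closest fraction with a small denominator.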
What if I am dealing with another language?
All the previous results are in English. The same can be achieved in other languages such as French, Russian, and more.
The first step is to activate the internationalization (i18n) feature using the i18n.activate function.
For instance, it is possible to render a time delta with a duration of 3 seconds, but this time in French.
# Activate the French Language
_t = h.i18n.activate("fr")
# Generate the time delta
h.naturaltime(dt.timedelta(seconds=3))
The result is il y a 3 secondes, which means 3 seconds ago in English.
Conclusion
Thank you for reading!