Privacy in today's age with a SOCKS proxy

Say you are at a cafe and want to surf the Web, but the WiFi is not secure. Or say your company lets you bring your own laptop, but its firewall blocks your favorite website. Is there no hope besides paying $15 to a VPN provider?

There is, and it costs about $3.50 per month as of this writing.

[Read More]

Getting top-N elements in Spark

The documentation for PySpark's top() function has this warning:

This method should only be used if the resulting array is expected to be small, as all the data is loaded into the driver’s memory.

This piqued my interest: why would you need to bring all the data to the driver, if all you need is a few top elements?

The answer is: it does not load all the data into the driver’s memory.
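To see why only a few elements need to reach the driver, here is a minimal sketch in plain Python, with no Spark involved. It assumes each partition is first reduced to its own n largest items and that those small candidate lists are then merged pairwise. The top_n helper and the sample partitions are hypothetical names made up for this illustration.

```python
import heapq
from functools import reduce


def top_n(partitions, n):
    """Return the n largest items across all partitions, largest first."""
    # Each partition is reduced on its own to at most n candidates.
    local_tops = (heapq.nlargest(n, part) for part in partitions)

    # Merging two candidate lists again keeps at most n items,
    # so no step ever holds more than 2 * n elements.
    def merge(a, b):
        return heapq.nlargest(n, a + b)

    return reduce(merge, local_tops, [])


# Hypothetical data standing in for three partitions of a dataset.
partitions = [[5, 1, 9], [7, 3], [8, 2, 6]]
print(top_n(partitions, 2))  # prints [9, 8]
```

With this scheme the amount of data collected at the end grows with n and the number of partitions, not with the size of the dataset.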

[Read More]
spark 

Accessing home computer from anywhere

Do you sometimes want to access your home computer from an outside network? Maybe you are using another system that you do not fully trust, and would prefer your home computer for some workflows?

This post outlines the steps to make such access possible.

[Read More]

Correct way to create a directory in Python

Can you see the problem with this code? It comes from Ansible, v2.1.1.0.

if not os.path.exists(value):
    os.makedirs(value, 0o700)

It’s quite straightforward. It checks if a directory path exists. If it does not, then it creates the directory path, similar to mkdir -p. What could be wrong?
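One possible concern, offered here as an assumption rather than as the answer the full post gives, is the gap between the check and the creation: if another process creates the directory in between, os.makedirs raises an error. A minimal sketch of a variant without that gap, assuming Python 3.2 or later where os.makedirs accepts exist_ok, might look like this. The ensure_dir helper and the temporary paths are hypothetical.

```python
import os
import tempfile


def ensure_dir(path, mode=0o700):
    # exist_ok=True makes the call succeed when the directory is already
    # there, so no separate existence check is needed beforehand.
    os.makedirs(path, mode=mode, exist_ok=True)


# Hypothetical target path inside a fresh temporary directory.
base = tempfile.mkdtemp()
target = os.path.join(base, "a", "b")

ensure_dir(target)
ensure_dir(target)  # the second call does nothing instead of raising
```

Note that exist_ok=True still raises FileExistsError when the path exists but is a regular file, which is usually the desired behavior.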

[Read More]