Is Google directory listing still an issue in 2015?

Way back in 2003, when I was a student, there was a pretty famous website run by a guy who had taken the time to go through Google and list search strings that returned content that should have been hidden away. The site, “Johnny I Hack Stuff” (now found under a new alias at http://www.hackersforcharity.org/ghdb/), was shut down for a while, but checking today it looks like it is back, though without updated content.

Whilst reading a pretty mundane looking “Life Pro Tips” article I came across a tip for searching Google for free Android APKs. The idea was pretty simple in its design and it reminded me of the content of the GHDB (Google Hacking Database). I was curious what else I could stumble upon using that string as a starting point. This is 2015, 12 years after that initial list of search strings started to appear. Surely people have learnt since then, and even the listings that were opened close to the GHDB heyday have been closed down? Well, the answer is sadly no, no they have not.

Search Criteria

The crux of this is that I want to avoid downloading anything actually illegal. I would rather this article wasn’t used to promote all the free content you can get on the internet without torrents or Usenet. When I do come across anything illegal I will pull one file that won’t get me in trouble, purely to check permissions, and that is it. This can be a .nfo, .txt or image file. Basically anything I will not get an angry letter in the post for.

I have thought long and hard about whether I should expose this private but public content. The crux of it is that most of the examples have had quite a few years to cover up the content. There is only one way to learn now and that is by example. I will do my best, where possible, to contact each person and warn them when I publish. It will be up to them to fix the hole before people go directly to their content.

PDFs

 String used: -inurl:htm -inurl:html intitle:"index of" pdf

OK, we are going to start off with something rather innocent looking. In an ideal, law abiding society I should get back only manuals and public domain publications, but this was not the case, as I am sure you can imagine.

Taking the first Google result, it looks like we have already run into quite a broad collection of ebooks. I would say this is a package that the user has downloaded. There is a link to go up a folder and this takes us to what looks like a band’s website.

Thankfully there is a link to their Facebook page, so I will leave them a note on there telling them to fix this.

Below is a grab of their neat little collection. Quite well read bunch if they have consumed all of these.

google_index_of_#1

Going down the list a little, there seems to be someone’s user account on a college or university server. Back in the day it was common for sysadmins to plug home folders directly into Apache, making each user’s home folder available to Apache and to everyone with access to it.

This is where it starts to get embarrassing. We can see this man is not only reading up on all things computer science and Linux but is also reading up on sex advice. I have to admit I had to look in the folder… but what is worse is his ‘Dick Guide’. Not sure this is something you want associated with a photo of yourself.

 

google_index_of_#2

 

As before we can travel up a folder here and this is where it gets interesting. So we know what college or uni this chap was at and now we find out more about him. We know part of his name from his username. Turns out we can even find a photo of the poor chap here (below).

google_index_#3

So not only can we work out from the name of the photo that this is a chap called Pavlin, but we can also see what his schedule was and photos of places he visited. This is a stalker’s wet dream.

Sadly it doesn’t stop there as from here you can get to the rest of the users on the system… not good.

Conclusion

As I do a bit more digging through the Google results I can find further examples of open personal documents and PDF stashes. I have to say, I feel bad for the authors of those books. Not only do these people pirate their books but they probably don’t even do the decent thing and read them.

DCIM

 String used: -inurl:htm -inurl:html intitle:"index of" DCIM

In case you are unaware, DCIM (Digital Camera Images) is a common folder name used by digital cameras and mobile phones for a folder of photos.

I was amazed by the first result for this search term, which was called “phone backup”. Would someone back their phone up to a publicly accessible place?

Looking at the content, however, I quickly realised I had stumbled upon what seem to be some very intimate and personal photos (WARNING: NSFW). This genuinely seems to be a dump of someone’s memory card from their Android phone.

google_index_of_#4

We have one of their Facebook profile photos:

IMG_9786135113259

Hmmm, that is sound advice. Maybe I should check my own sites for directory listing before I post this? Sorry Inge, but that was a bad location to store your private phone photos. Again we can traverse up further and see more private photos and even which applications were stored on her phone as APKs (Android Application Package files). I didn’t really enjoy going through this lady’s private photos so I didn’t go too far to see if there were any readable txt files or other information.

Let’s leave poor Inge alone and have a look at the next listing. Again this seems to be a backup from an Android device and again it contains the user’s personal photos. This person is just filter mad, how about glue sticks… with filter?

FxCam_1323199000039

 

Conclusion

There are just too many hits for this search to go through them all. I was quite amazed by how many results I got back and how easily I could find folders labelled ‘private’ that really are not, and even family photos. I can’t contact all these people but really they need to find out somehow.

I even changed tack a little and altered the search to check for “100ANDRO”, but this returned far too many positive hits from people who had uploaded and shared their phone contents publicly. There is a lot of porn and private images so for now I am going to leave this one. If you have ever backed up one of these folders you might want to check the permissions on it!

MKV

Search Used: -inurl:htm -inurl:html intitle:"index of" mkv
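All three searches in this post share one template, with only the final keyword changing. As a rough illustration, that template can be written down as a tiny helper. The class and method names here are made up for this sketch and are not part of any Google tool.

```java
// Hypothetical helper: the names are illustrative only.
class ListingQuery {
    /** Builds the search string used in this post for the given keyword. */
    static String build(String keyword) {
        // Exclude ordinary .htm/.html pages and keep pages titled "index of",
        // which is how auto-generated directory listings are usually titled.
        return "-inurl:htm -inurl:html intitle:\"index of\" " + keyword.trim();
    }

    public static void main(String[] args) {
        // The three keywords tried in this post.
        for (String keyword : new String[] {"pdf", "DCIM", "mkv"}) {
            System.out.println(build(keyword));
        }
    }
}
```

Swapping the keyword is all it takes to go from ebooks to phone backups to video files, which is a large part of why these listings are so easy to find.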

I already know what type of content this is going to return. MKV (Matroska) is a container format for video and audio that is often used for “HD” content. Most modern torrents come with this as the file format of choice.

What I am curious about, though, is that we always hear of the music and film industry going out of their way to issue takedown notices on sites such as YouTube or Vivo. But how much effort have these companies put into easily accessible content like this, and to what effect?

Results

I can’t say I am surprised, but the first Google result seems to be someone’s shared DLC folder. This is the download folder of another popular file sharing application, but one that does not need you to share your collection as a public download folder. I checked one .nfo file and it was accessible and I could read the contents, so I assume that the movies are downloadable as well.

This person is pretty much laying out their crime, file by file, with accurate web logs of what was downloaded and which IP address downloaded each file. This seems a much better and easier target for the MPAA to attack?

Going down the list there are one or two spam sites that are designed to look like these accidentally listed pages but instead take you off to one of those annoying download sites. Some of them were quite good at locking up your browser so you can’t get off them. At this point I was starting to wish I had done my searching from the comfort of a virtual machine rather than my main development computer.

Other Pages of Interest

Whilst I was investigating these open listings I came across other pages that cover the same issue. One of them is even more worrying than what my basic search turned up.

Shodan HQ

http://www.shodanhq.com/

This is a website that makes it easy to find connected devices, like webcams and routers, that still have the default login credentials set. It is of interest because a lot of these devices can also be found using Google. I will do a review of this later on, but the content is easily accessible and worrying.

Johnny I Hack Stuff

 http://www.hackersforcharity.org/ghdb/

The original list, started way back in 2002/2003. It is worrying how many of these search strings are still active or can easily be modified into a modern attack vector. A great starting point for more insight into the issues with Google URL hacking.

How to block Google Indexing Parts of your Site

http://perishablepress.com/tell-google-to-not-index-certain-parts-of-your-page/

This is a pretty comprehensive guide on stopping Google from indexing pages that you really don’t want the public to see. In reality, in most cases just putting a blank “index.html” file in every new folder will cover it. That, and only allowing Apache to see folders in your /var folder. But this covers all you need to know.
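Before relying on any of these fixes it helps to check what your own server actually returns for a folder. Below is a minimal sketch, assuming Apache-style listings whose page title starts with “Index of” (the same text the intitle: searches above match on). The class and method names are hypothetical, and fetching the HTML from your own site is left out.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker: names are illustrative, not from any library.
class ListingCheck {
    // Captures whatever sits between <title> and </title>, ignoring case.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns true when the page title starts with "Index of". */
    static boolean looksLikeListing(String html) {
        Matcher matcher = TITLE.matcher(html);
        if (!matcher.find()) {
            return false; // no title at all, so nothing for intitle: to match
        }
        String title = matcher.group(1).trim().toLowerCase(Locale.ROOT);
        return title.startsWith("index of");
    }

    public static void main(String[] args) {
        String page = "<html><head><title>Index of /backup</title></head></html>";
        System.out.println(looksLikeListing(page));
    }
}
```

A page with a blank “index.html” in the folder has no such title, so a check like this should come back false once the fix is in place. It only looks at the title, so a listing generated with a custom title would slip past it.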

 

Earn Extra Money on the Loo

Most of us take some delight now and then in the fact that, while at work, we are being paid to poop. It is a nice feeling, doing something one must and yet still earning while enclosed in that little cubicle.

What if I told you, though, that you could now and then earn even more money while you are sat there? No, this is not a pyramid scheme or some crazy Snapchat photos of your junk for cash thing.

It is not very well known to most Android users, but Google actually has an application called “Google Opinion Rewards” which now and then presents you with a quick survey, three minutes at most and usually shorter, normally based on a visit to a store.

For your effort you seem to get back between 15 and 80 cents in Google Play credit. If you are lucky enough to get a few of these in a month you can rack up enough for a free movie rental. Not bad for sitting on one’s ass, now is it?

So if you don’t mind a little extra snooping from Google and the odd question on data they already have, click the image below to go to the Google Play Store and get the app installed.

Google Opinion Rewards
Click Here to take you to the Google Play Store to get the App

Top 10 Words in a Chapter of a Book

[TLDR]: Top Ten Words Solution

Best Solution: Github - Version Three NIO
Summary: Using NIO and reading character by character we can pull out words whenever we come across a non-letter character
Big-O: O(n) - For every character in the file we do an action

Analysis

This week I had the pleasure of interviewing for a top tech company here in Silicon Valley. The interview did not go well at all, and looking back at my coding solution afterwards I can see exactly why.

The question I was asked by the interviewer for the programming challenge was:

Given a large string such as the chapter of a book identify and print out the top ten words based on length.

Below is my solution from the interview, admittedly cleaned up with comments and made to run. But it is the same solution and to be honest I can see now why I was not given the job. The app is slow and there are numerous potential pitfalls in the way it works:

Full code for this version can be found here. Plus below is the output from the initial version showing words found and running time.

From the off there are a few issues with this application which we do not run into with our dictionary file:

  • It doesn’t handle repeated spaces between words or other word terminations like periods, semicolons or new lines
  • We read in a line as a string then parse as a char array
  • The ArrayList to array of chars conversion is slow
  • We check for words that are smaller than the smallest word in the array despite no replacement being needed

Let’s take another stab at this, fixing a few of these issues from the start. The main issues as I see it at the moment are the ArrayList and the lack of a short circuit, so let’s see what effect bolting those changes in has on the execution time.
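The original code for this version is linked rather than reproduced here, so what follows is only a rough sketch of the two changes described above, not the author’s implementation. It assumes a word is any run of letters, keeps the current top words in a small min-heap instead of an ArrayList, and skips any word that cannot beat the shortest word already kept. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: class and method names are illustrative only.
class LongestWords {
    /** Returns up to 'limit' distinct words from 'text', longest first. */
    static List<String> top(String text, int limit) {
        if (limit <= 0) {
            return new ArrayList<>();
        }
        // Min-heap on length: the head is always the shortest word we are keeping.
        PriorityQueue<String> kept = new PriorityQueue<>(Comparator.comparingInt(String::length));
        StringBuilder word = new StringBuilder();
        // Run one step past the end so the final word is flushed too.
        for (int i = 0; i <= text.length(); i++) {
            char c = i < text.length() ? text.charAt(i) : ' ';
            if (Character.isLetter(c)) {
                word.append(c);
            } else if (word.length() > 0) {
                // Any non-letter ends a word: repeated spaces, periods, new lines.
                offer(kept, word.toString(), limit);
                word.setLength(0);
            }
        }
        List<String> result = new ArrayList<>(kept);
        result.sort((a, b) -> Integer.compare(b.length(), a.length()));
        return result;
    }

    private static void offer(PriorityQueue<String> kept, String candidate, int limit) {
        // Short circuit: a full heap only changes if the candidate beats its shortest entry.
        if (kept.size() == limit && candidate.length() <= kept.peek().length()) {
            return;
        }
        if (kept.contains(candidate)) {
            return; // already tracked; the heap never holds more than 'limit' + 1 items
        }
        kept.add(candidate);
        if (kept.size() > limit) {
            kept.poll(); // drop the shortest word
        }
    }

    public static void main(String[] args) {
        System.out.println(top("the quick  brown fox; jumped.over\nlazy dogs", 3));
    }
}
```

Treating every non-letter as a terminator also splits words such as “don’t” in two, which is a simplification a real solution might not want.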

Again we will run the application 10 times and get an average execution time.

Awesome! With those small changes we have shaved about 139.5 ms off the average execution time for the app. Now let’s see if we can remove the need to have two loops, one for each line of the file and one for each character in the line.

As we are getting the data a line at a time and then scanning over the line a character at a time, it would be a lot better if we could just get the data as a block and then scan through that as a collection of characters. We don’t want to read the file a character at a time, as I think going back to the file on disk, reading it and then processing is going to be slow.

This is where NIO comes into play (Java’s New I/O package, often read as “non-blocking I/O”). We can read in a set buffer amount (and hence work within memory restrictions) and then process that block.
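As a rough sketch of that idea, and not the linked Version Three itself, the block below reads a file through a FileChannel into a fixed-size ByteBuffer and hands each word to a callback. It assumes a single-byte encoding such as ASCII, and the class name, method name and callback are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

// Hypothetical sketch: names are illustrative only.
class ChunkedWordReader {
    /** Reads 'file' in chunks of 'bufferSize' bytes and passes each word to 'onWord'. */
    static void readWords(Path file, int bufferSize, Consumer<String> onWord) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        // Lives outside the chunk loop so a word split across two chunks is rejoined.
        StringBuilder word = new StringBuilder();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (channel.read(buffer) != -1) {
                buffer.flip(); // switch from filling the buffer to draining it
                while (buffer.hasRemaining()) {
                    // Assumption: one byte is one character (ASCII or Latin-1).
                    char c = (char) (buffer.get() & 0xFF);
                    if (Character.isLetter(c)) {
                        word.append(c);
                    } else if (word.length() > 0) {
                        onWord.accept(word.toString());
                        word.setLength(0);
                    }
                }
                buffer.clear(); // make room for the next chunk
            }
        }
        if (word.length() > 0) {
            onWord.accept(word.toString()); // the file ended in the middle of a word
        }
    }
}
```

Because words go to a callback, any top ten tracker can be plugged in, and the buffer size caps memory use no matter how large the file is. A multi-byte encoding such as UTF-8 would need a CharsetDecoder rather than the byte-to-char cast used here.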

Let’s have a look at the speed of the app once we apply the NIO change:

So let’s have a look at how this performs!

Even better, we are now down about 425 ms from the original application! This is getting to a much more sensible run time for the app.

I feel, though, that we might be able to get just a little more out of this, and then we can start feeding the app some much larger files to process. During the interview I was asked if there was a hardware method I could use to make this faster. Of course the answer would be to use multiple threads to parse the buffer of characters. So let’s have a look at how fast that is.

Splitting the app into threads, we end up with three files: one to contain the top ten words logic, one to handle the buffer/thread and one that is the actual app that reads the file and starts the threads.

Unexpectedly, though, this is actually slower than our Version Three of the application. In this case it looks like the overhead of setting up and running the threads takes up so much time that it makes the application slower.

I tried running a much larger file, 3.2 MB vs the 2.5 MB of the dictionary file I was using before. Even with this file it turned out that I could not beat the performance of the streamlined Version Three of the app.

In Summary

It turns out that the fastest approach to this problem is to read using NIO and buffer chunks of the file, then pull out the words by reading character by character.

If asked about using threads then yes, you can, but you still have to pull out the chunk to give to the thread, and the main delay here is reading the file, not time spent in the algorithm.

If you think you can make improvements then please do fork the GitHub project and go for it! – Link

The Android Easter Egg; A Hidden Game

Most Android users may know about this already, but hidden away in some versions (if not all) of the Android OS is a small game.

To get access to the game / easter egg follow the steps below:

  1. Go to ‘Settings’
  2. Scroll to the bottom and click on ‘About Phone’
  3. Then scroll to the bottom again and this time keep tapping quickly on ‘Android version’

To play the game, click on the lollipop and then slide off to a side. It can take one or two goes to get it right.

Controls: 

Hold finger down to go up

Screenshots

 

Android Easter Egg
Initial Easter Egg screen. Click on the lollipop and slide out to get access to the game

 

Android Easter Egg Game
In game screenshot

 

 

Have Fun!

Android Build Variations and Build Tools 19.1.0

If you are using the following Gradle script to automatically find and replace in your AndroidManifest files, then the update to Build Tools 19.1.0 is going to break your build.

Old Version

Original Ref

 

Never fear though, as just updating the build path is going to fix this issue and let Gradle know where to look for the manifests again.

New Version

 

Build Tool 19.1.0 and Cloudbees – Jenkins Issues

For those of you that have recently updated your projects in Android Studio 0.6.0 to use Build Tools 19.1.0 then this is for you.

The error you are going to be looking for is :

To fix this you will need to add a new Execute shell step under Build that calls the following to update the SDK.

This will download and update the tools, platform-tools and build tools to the required version and your project should continue as normal.

Also, don’t forget to check out the CloudBees example Android Jenkins project:

https://mobile-examples.ci.cloudbees.com/job/Android/job/android-sdk-update/configure

Google-Music Or Spotify

Google Music ‘All Access’ has now been in the United Kingdom for a while. So if you haven’t yet made the choice between Google or Spotify for your streaming music hopefully this will help you decide.

Feature Comparison

Feature | Google Music | Spotify
Web Player | Yes | Yes
Standalone Player (PC app) | No | Yes
Device Support | Currently Android only. iOS client on the way. Can be used via the web client. | iOS/Android/Mac/Windows/Linux
Offline Files | Yes | Yes
Mobile/Tablet App | Yes | Yes
Library Import | Yes | Local media only
Third Party Apps | No | Yes
Play List Sharing | Google+ mainly | Yes, lots of methods
Last.fm Support | By plugin / other app | Yes
Device Integration | Google only | Third party supported
Number of offline clients | 10 | 2 (Mobile/PC)
Cost Per Month | £7.99 | £9.99*

* Based on Spotify premium which has the same features as Google Music.

Personal Choice

Well, I switched over to Google Music. For me, the features missing from Google Music were not worth the extra £2 a month for Spotify Premium (£9.99 against £7.99). I do hope that Google eventually makes an API and lets developers build extensions, but for me it is no deal breaker.

The GUI of Google Music is simple but effective, and the same goes for the Android app. Both function very well and the browser version also works with HTML5.

Google does need to get its act in order though. Spotify is leading the race to become a household name. It has been around for longer and it is now being bundled with services like mobile plans or cable plans. With this plan Spotify will ultimately attract more customers than Google All Access and, with more customers, get a better deal with the music labels. Google has the money but Spotify has the customer base.

Addition: Device Integration :

Another point that one of my friends reminded me about is that Spotify also has much greater support when it comes to hardware integration. There are companies like Sonos that offer speaker sets that contain a Spotify client. This can be controlled from a mobile device, like a remote, using the Spotify app. Another example of simple integration across devices.

Google started going this way with its Nexus Q, but that was not met with any great support. They even started with the Chromecast and Google TVs, but from a consumer point of view I would think that the simplicity of a speaker set that is ready to go out of the box is a lot more tempting. This is music and people want to keep these things separate.

Wow-got-cha

The hidden truth of Wowcher/Groupon and similar sites.

These websites offer users the option of buying products at value prices. The idea is that, as so many people are buying, you will benefit from a mass buying discount. As well as physical products these sites also offer cheaper services such as holidays, restaurant deals and days out.

But let’s take the restaurant part and imagine that you own a successful restaurant or an up-and-coming joint that values its food. Then the need to just churn customers through the door is simply not there.

You can see by now where I am going with this; it is basic economics. Good restaurants that have returning customers are unlikely to offer these packages.

The same logic can be applied to the holiday packages or day trips. Before you take any of the deals you have to ask yourself: why is this company offering this deal?

The best thing you can do to avoid getting burnt is to go and check the reviews. Find out what other people have thought of other services or offers from these places. And remember, if you are getting a holiday package, to check you are ATOL covered (atol.org.uk).

You should check out the hotels being offered on TripAdvisor. Plus if there is the option of flight and they mention availability then beware. This means that they will have a few good flights and the rest will most likely be the flights no other fool would take. So price it up yourselves, which airports and time for the price of the deal do you think they will use? You will most probably be on the mark.

As for products you buy make sure you know how much it normally costs and what quality you are getting. The best way to do this is simply Google it or go search for it on Amazon. If no one has heard of it you are taking a big gamble.

Lastly, if it is a delivery service that is cheaper than normal, make sure that you are not signing up for repeat orders. The company will probably bank on 90% of people not checking the fine print.

You can get some very good deals out of these sites. But if you do not know what you are getting and blindly buy a deal, then this is as good as gambling with your money!

Safe shopping all.

Get an Android Firmware Update Without Waiting

Things like this I normally take with a pinch of salt, right next to putting your hard drive in the freezer (no, it does not help). But this actually worked first time:

http://www.pcadvisor.co.uk/how-to/google-android/3461225/get-android-43/

So, why wait for your Jelly Bean, 4.3 update?

EDIT::

Turns out that this method does break your access to the Play Store. To restore your access, remove the user under Settings and then add it again.