Lance P. Hawk, Manager of Computer Forensics and Investigations at ‘Air Products and Chemicals, Inc.’, takes the floor at the InfoSec World conference to deliver an instructive presentation on how in-depth forensic analysis and tracking can be conducted using a variety of web-based techniques and tools.
I think we are ready to begin our session “Using the Internet as an Investigative Tool”. This is not meant to be a technical session. It’s more to tell you the various sites that I use on a daily basis. My name is Lance Hawk. I manage computer forensics and investigations.
I have been doing that for 24 years. I also help out various levels of law enforcement, from local and state all the way up to federal, because I actually grew up with forensics. Another responsibility I have is cyber risk mitigation and response, which mostly has to do with attacks and stuff like that.
This is a presentation that I give that actually grows based on input from the audience. First thing: general process and tools to document findings. We are going to talk about everything from screen grabbers and screen captures – to web mirrors and different things you can use to document your investigation.
General search engines – in this presentation I’ll only talk about two search engines: Google and Bing, I leave out Yahoo. And you may ask: “Why do you leave that out?” Yes, I realize it indexes 4 Billion+ pages, but I seem to get more of a difference between the other two search engines. So it isn’t that I am saying, you know: “Don’t use Yahoo”; you may use whatever you are comfortable with.
Meta search engines – two big ones out there, we will talk about them. A lot of people don’t know the difference between when you use a general search engine versus a meta search engine, we will get into that.
Translating texts and web pages – I’ll talk about two different websites that can do that fairly well. I say ‘fairly well’ because there is a concept called ‘trust but verify’ in everything we talk about. Everybody knows what that means. I mean, if you are translating something from simplified Chinese to English, you know, it might not always sound so good, so you got to verify, just like with some search engine results.
We are going to talk a lot about search engines that I love, but still, just like in the last session the speaker was talking about verifying it with another source, when you use search engines you have to verify it with another source. You really do because sometimes search engines only update themselves once every six months. I know one that doesn’t do it for two years. So depending on how old that data is, you don’t wanna get wrong results.
Search engines for the blogs and social investigations – when we get to this, I actually wanna add one aspect today in regards to recent caseloads, so I have an idea of what’s going on.
Air Products and Chemicals is a gases and chemicals company, a ‘Fortune 500’ company. I’d like to say we’re in the top 100 – we are not, and we are not in the top 200 either. I guess it’s around 300, we are pretty close to that, with close to 20,000 employees worldwide.
So when I talk about the investigations, I’m gonna also talk about the international impact of a lot of these reviews. We will actually discuss a Chinese search engine, so we can get into it and understand the difference, you know, when you use one versus another, versus another.
Next, searching wikis and tweets – tweeting is actually interesting. A survey just came out saying there’s been a 20% increase in crime just using tweeting. In fact, they said it used to be 1.6%, now it is 2% of people. You know, 2% of tweets in a survey that was done are somehow and someway committing crimes. And that was done with a review of 10000+ tweets.
Searching for people, searching for email – I also will talk about reverse address lookups. With reverse address lookups, a lot of them are already built into some of the tools we’re gonna be talking about here.
Miscellaneous searching, the IP address searching – that’s geolocation that has brought a whole new round of forensics and searching.
And there is a great freebie out there we will talk about, and my motto is: if it’s free, it’s for me, as long as you can trust it, yet verify it. And there are some other great ones that actually go into the history of registration of IP addresses, which never existed before.
Searching for risks to your corporate reputation – one you probably know of; there are two that actually, if you feel bad one day, you might take a look at: you submit your corporate name and see what people say about your company, it’s amazing.
And finally, I will talk about accessing archived web pages.
Okay, this is what I call my ‘CYBB’, the ‘Cover Your Big Butt’ disclaimer (see image). The information is presented ‘as is’, we’re gonna talk about tools that can be used not just on the good side but the bad side; the same thing with some of the websites discussed. It’s assumed you have the appropriate authority to use such tools and your company supports such tools. Any websites, processes, tools used will be done at your own risk. And the opinions presented herein are not from ‘Air Products’, they are from me, and they don’t reflect ‘Air Products’ or any other agency I’ve done work with.
Now, why use the Internet as an Investigative tool? We all know: corporate, civil and criminal investigations – that’s the majority of my work. You can also add personal in there as there is a lot of personal consultation done.
Intellectual asset protection, I didn’t use to do much of this but that’s becoming more and more predominant, especially overseas for me, including protecting your ‘brand’. You know, how can you do that, what kind of websites can you use?
Litigation support, eDiscovery – eDiscovery and computer forensics go hand in hand.
Competitive intelligence – it’s amazing that competitive intelligence is built into some search engines.
Then I’ll talk about compliance, and I really like this quote, which just came out in a survey: “95% of workers use IT resources for personal reasons at work, yet 40% have no social networking guidelines”. That says a lot, and that’s from a Unisys study.
What should you use to search? This is the first question to establish who the true geek in the room is. You have to be very careful. You wanna do your searching in a covert manner as much as possible. And generally you wanna use a dedicated independent resource. I have a machine I’ve set aside specifically for this. It is used for nothing but this Internet searching. And not only do I have it set aside, but it goes without saying you have all the up-to-date virus and malware protection.
In fact, that machine – we have a normal corporate build, what is called the Gold Load – it has none of that on, that’s totally separate from my corporation. And it never gets plugged in to the company Internet or Intranet. It’s used specifically for searching.
I suggest covert searching, ‘Sam Spade’ is a good one. Is anybody in here familiar with ‘Tor’? Glad to see that, glad to see more and more people are getting familiar with that.
‘Tor’ is actually what was used when all the bedlam over in Egypt and China, and everything else was coming up. It is the slickest way to avoid detection that’s out there. Now this is where the ‘CYBB’ disclaimer comes in because it is used for good as well as bad. What it does is it employs something called the ‘Onion Routing’, where basically I’ll connect to a server, which will connect to a server, which will connect to a server, it’s like a Verizon commercial, and so on and so on. And you can’t track back through multiple hubs. And I know various governments have tried, I know the U.S. government put some credence into that, too.
But I also use it just from a searching perspective because we do a lot of work over in the Middle East, obviously. And if you think about what gases and chemicals can be used for, and a lot of times people steal our product, the government is worried what our products could be used for, just think about it.
You wanna make sure that if you are investigating, say, some place over in Iran, and you are looking at their website – to make sure they don’t know you are looking, Tor is like the best thing out there for that.
Multiple browsers. I use multiple browsers for a variety of reasons. You do not want to stick with one, I mean Firefox, IE, whatever you like. I mean they are all out there, but never just keep yourself to one, because I am amazed at the number of times one returns something different from another, or one is down and one is not – so multiple browsers.
The other thing you wanna consider is multiple email addresses. You wanna set up a bunch of dummy emails. I have at last count, like, 61-62 bogus email addresses. And I have them with a variety of different places, not just the typical Hotmail, Gmail, but I have ones set up in different countries where we do business. I can actually go and use those emails together with Tor, so I am getting a double layer of protection. You generally do not want things tracked back to you.
Tools to Document Findings
Okay, what are the general processes here? Well, there’s probably a couple of tools you should have. First off, you’ve probably heard the terms ‘screen scraper’, ‘screen grabber’, something like that. The one I like, pretty slick, and it’s free for me – is FastStone Capture. With this tool, you can actually take a capture right out to a Word document or PowerPoint; you can take video files out. You basically use a hotkey combination to call it up, and when you see something you like – press a key and it will record this, press a key and it will record that.
There are several others out there: Snagit, Hyperionics, CamStudio. I’ve just used FastStone Capture, it’s been out there quite a while.
Now, if you ever capture something and it looks like it’s going to litigation, the second rule of forensics, after acquisition, is what’s called ‘authentication’. How can you prove if I capture the graphic from this gentleman’s company, and he has a website up? How do I show that wasn’t modified by the time I captured it to the time it’s produced as evidence in court? And you use something called authentication for that. There is a nice slick little program that does all the levels of authentication called HashCalc (see screenshot).
So if I were to actually run, say, that graphic that I scraped off this website, I’d run it through HashCalc. It gives you a hash value, which is just a fixed-length string calculated mathematically from the contents of the file, so you note that down. And when you go to court, you can run the same calculation against the file again and say: “See, I can prove that I captured it on this date, and from that date forward it wasn’t altered”. The same thing is done with computer forensics, so you have to use the same type of process.
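As a rough illustration of that authentication step, here is a minimal Python sketch that does what a tool like HashCalc does conceptually: compute digests of a captured file, record them, and later recompute to show the file has not changed. This is not HashCalc itself; the function names `file_digests` and `is_unaltered` and the choice of SHA-1 and SHA-256 are assumptions made for the example.

```python
import hashlib


def file_digests(path, algorithms=("sha1", "sha256"), chunk_size=65536):
    """Return {algorithm_name: hex_digest} for the file at `path`.

    The file is read in chunks so large captures (video, site mirrors)
    do not have to fit in memory.
    """
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def is_unaltered(path, recorded_digest, algorithm="sha256"):
    """Recompute the digest and compare it with the value noted at capture time."""
    return file_digests(path, (algorithm,))[algorithm] == recorded_digest
```

In practice you would note the digest in your case log at the moment of capture, then call `is_unaltered` on the same file when the evidence is produced. A matching digest shows the bytes are identical; it says nothing by itself about when the capture was made, so the date still has to come from your documentation.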
I’ll now talk about one of the things I do, especially for overseas. Well, you’re gonna capture whatever it is, but somehow you need some kind of an indexing system. The indexing system we use is actually dtSearch. dtSearch is the indexing system used by FTK, if anybody here is a forensics examiner. Now, for 2000 bucks, what I’ve done is I deployed dtSearch around the world in different locations. And if we make an acquisition overseas, then what we’ll do is we will feed it through this tool called dtSearch, so it’ll index all the emails, it’ll index all the documents, it’ll index the PowerPoints – all of that stuff. An alternative to dtSearch that a lot of people like is something called Copernic, and it actually does have a free release for home use only.
One other tool I debated on putting up here has to do with capturing website content. And I use the tool called BlackWidow (see image). It does a great job of going behind the scenes. I use this a lot in case we have an investigation somewhere in the Middle East – somebody is selling our product and they actually have our graphic files. So I use BlackWidow to rip their website and pull out all the JPEGs and all the documents. Therefore I am able to do a compare between what they have and what we have, and see if it’s an exact match in every way, shape or form. So you might consider a tool like that. There are some limitations depending on the coding and everything else, but I think you need some of these tools to at least start.
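The "exact match" comparison described above can be sketched in a few lines of Python. This is only an illustration under assumptions: it presumes the ripped site and your own originals have already been saved into two local folders, and the function names are hypothetical. It finds byte-for-byte identical files only; a resized or re-compressed image would not match.

```python
import hashlib
from pathlib import Path


def digest(path):
    """SHA-256 of a file's contents (fine for typical images and documents)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def index_by_digest(folder):
    """Map each digest to the list of files under `folder` that have it."""
    index = {}
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            index.setdefault(digest(path), []).append(path)
    return index


def exact_matches(ours_dir, theirs_dir):
    """Return (our_file, their_file) pairs whose contents are identical."""
    ours = index_by_digest(ours_dir)
    theirs = index_by_digest(theirs_dir)
    shared = sorted(ours.keys() & theirs.keys())
    return [(o, t) for d in shared for o in ours[d] for t in theirs[d]]
```

File names are ignored on purpose, since a copied graphic is often renamed on the other site.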
Using Google and Google Services
Okay, now let’s actually get into another important thing. This changed recently, so this is actually relatively new. There used to be a ‘Preferences’ button for Google, but you now see this ‘little wheel’ (see image). And in the wheel, the number one mistake I think people make when they do searching, especially with Google or Bing, or something like that – is keeping the default value. That’s an important point, that’s generally set to use moderate filtering. Now, that’s good, and it’s bad in a way. It’s good because it screens out a lot of explicit crap. And it’s also bad because, once again, it screens out a lot of explicit crap – that’s a technical term, IT term, I still maintain.
You’ve got to be careful here because – I don’t know your place of business, whether they have some kind of rule set – this isn’t supposed to be, you know, a crap protector where that’s supposed to be On or Off. Most businesses say they want some filtering done because they don’t want somebody to do some search on the word ‘Titanic’ one time. You know, do a search on ‘Titanic’ and see what might come up with ‘Moderate’ filtering versus ‘Do Not Filter’.
At home, yes, I always recommend possibly even almost up to strict filtering. But a lot of times, probably once every 2-3 months, I get a call where an auditor has done something wrong. And they say: “Could you please look, I can’t believe we didn’t find anything?” – I say: “Did you change your filter settings?”. They say: “Oh, let me know where the filter settings are”. A lot of people don’t know that, so that’s one good point.
And Google has got 10 Billion+ web pages, approximately 18% of what’s out there, which is impressive. So you wanna set your preferences.
Now, cache is king. Cache has saved my butt in many investigations. People find out you are investigating them, looking into something. What are they gonna do? They’re gonna change their website, if it’s a website investigation that you’re doing. Well, this little line here – ‘Cached’; think of it as a backup, almost an online backup. You go to ‘Cached’, and the site shows up a lot faster, and it’ll be basically the latest backup of whatever that website is. ‘Cached’ is great, especially if there is congestion, everything else – you wanna go to ‘Cached’.
Another good thing, you don’t use it too much and a lot of people don’t realize its importance, is ‘Similar’. Anybody know what ‘Similar’ means? Okay, it’s good to find out maybe your competition. For example, ‘Air Products’ makes the product called ‘Surfynol 104’, and we had people who were stealing that. I could actually go and do a search on ‘Surfynol 104’, look at ‘Similar’, and it will give you, like, your competitors’ opposing products. If I would click that there, it would give ‘Air Products’ competitors – you know, people who sell something similar. It’s just good to know about.
‘Google Alerts’ – hopefully people are using them. If not, they are great to use. You basically set them up for a variety of types, whether it’s news, it actually could be a separate blog, it could be a separate website, video group, whatever. And the one thing I would recommend though, you don’t want it ‘as-it-happens’ because sometimes you get hammered with stuff coming at you. I mean once a day, once a week probably should be fine. And then ‘Email Length’ could be, you know, 20 results – up to 50 results. I have it set for 50 but, you know, it can be like ‘Air Products and Chemicals’ – if that comes up in the news, I wanna know about it several times a day.
One of the best blog searches is the ‘Google Blog Search’. What’s nice about that is that it’s not restricted to the Blogger blogs. Anything that publishes what’s known as a site feed, Google will basically capture. And once again, you have that ‘More’ coming off Google to get to some of this stuff.
If you do any investigation, you are going to see almost everything I talk about today: case in point, current investigation where we are working on somebody who has done something bad overseas, you know, dealing with the email, dealing with misappropriation of company assets, all of that stuff. Well, Internet history is very important: whether they’re in, what they are doing. So I’ll go through the whole thing from Google to the meta search engines, to the blogs, to the tweet searches – and you’ll see actually the process here.
The con with ‘Google Blog Search’ is that most progressive places publish their site feed, but if they don’t – Google won’t cover it.
The ‘Advanced Blog Search’ (see screenshot) gives you a lot more capability. You can customize your search, put exclusions in, as well as dates. Dates become very important when it comes to investigation. So the capability to restrict the date is really fantastic – unfortunately, you don’t see that too much in other blog searching.
Using Google Advanced Operators
You can use the multiplication sign (*) to indicate any one word or symbol. You see here Apples, then the multiplication symbol, then Oranges – so that’s ‘Apples and Oranges’.
Use the plus sign (+) to search for a common word. Like ‘Air Products’ and then the word ‘and’, you know, you would use the plus sign. Use minus sign (–) to exclude.
If you wanna search for Lance Hawk, and Lance Hawk is all you want – put the quotes around it (“ ”), and people just don’t do it. Also, it actually does have a wildcard capability, which is just the period (.).
Now, Google advanced operators – I highly recommend this (see image). If you do a search on that, there is a gentleman by the name Johnny Long I don’t know if people here know of, but he is a great guy. He is the father probably of Google hacking. And there is a lot you can do with this. I use Google advanced operators a lot, especially when I am searching websites and I am looking for just, say, Excel files or just documents. Just by knowing how to use a few operators, you can really fine tune a search. This is very high level, and this could be a whole session just in itself.
Okay, ‘site:’ operator is to be used if you wanna restrict the search to a specific site. A lot of times I am interested in somebody who possibly stole a chemical from us, and I think it’s, say, ‘ACME’ company, I can restrict it to just ‘ACME’ company with just the ‘site:’ operator.
‘filetype:’ operator – this is probably what I use more than anything, searching for PDFs, or if I am just searching for JPEGs, or just searching for TIFF files, or something like that, so use the ‘filetype:’ operator.
‘link:’ operator is used if you wanna search within the hyperlinks for a specific term. ‘cache:’ – I use it once in a while. ‘intitle:’ – again, I use that more when I have the title of a document that I think might have made its way out, so I put the document name with ‘intitle:’. And ‘inurl:’ – just as it says.
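To make the operators concrete, here is a small Python sketch that assembles a query string from the operators just described and turns it into a search URL. The helper names `build_query` and `search_url` are made up for this example, `example.com` is a placeholder domain, and how a given engine interprets each operator can change over time, so treat the output as something to paste into the search box and check.

```python
from urllib.parse import urlencode


def build_query(terms="", exact=None, site=None, filetype=None,
                intitle=None, inurl=None, exclude=()):
    """Combine plain terms with common advanced operators into one query."""
    parts = []
    if terms:
        parts.append(terms)
    if exact:
        parts.append(f'"{exact}"')          # quotes force the exact phrase
    if site:
        parts.append(f"site:{site}")        # restrict results to one site
    if filetype:
        parts.append(f"filetype:{filetype}")  # e.g. pdf, xls, jpg
    if intitle:
        parts.append(f"intitle:{intitle}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    for word in exclude:
        parts.append(f"-{word}")            # minus sign excludes a term
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query so it can be opened or logged as a link."""
    return base + "?" + urlencode({"q": query})
```

For instance, `build_query(exact="Surfynol 104", site="example.com", filetype="pdf")` produces `"Surfynol 104" site:example.com filetype:pdf`. Keeping the generated query strings in your notes also documents exactly what was searched.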
Once again, Johnny Long actually has a free white paper you can download, and he has a couple of books out.
Using Bing and Meta Search Engines
Let’s now talk about Bing – again, we get into this filtering issue, which you have to be concerned about. Maybe the big difference is more the explicit stuff, so if you are sure your investigation won’t turn anything like that up, you can probably stick with moderate filtering. But the worst thing you can probably do in any investigation is not work with the body of evidence that you should be working with.
So Bing has the same type of deal, it’s under ‘Preferences’, it goes to ‘Moderate’, and make sure you change it if you want.
There are search engines, and there’s something called meta search engines: what is the difference? The Grandfather of meta search engines is Dogpile. Why would you use one versus another? If you just have no idea what you are really looking for, you wanna do a general search, like on the word ‘chemicals’. If I’m gonna do a general search on the word ‘chemicals’, I’ll use a meta search engine.
Now, what a meta search engine will do depends on which one you choose. Dogpile assimilates Google, Yahoo, Bing, Yandex, and several more – the big players there, obviously. So if I am searching for the word ‘chemicals’, that’s gonna give me results back from each of those engines. So it might be some stuff I would not have gotten from Google.
On the other hand, if I am searching for something specific, like ‘Surfynol 104’, which is the product we make, that’s when you use like a Google, or Yahoo, or Bing type of deal, because this is like a scattergun. A meta search engine is like a scattergun, it will give you a bunch of hits from a bunch of different places.
So while looking through the stuff you might see under the word ‘chemicals’, maybe you’ll see something you are interested in that you want to search on or investigate further – then you bring that to the regular search engines, as long as you are sure of the difference between the two – it’s very important.
My friends in the federal community like Clusty. And when I started using it, it does return some different things, you see it covers different things. Especially there is a newer one – Gigablast, which seems to be getting a lot of traction, which you might think of as, like, a small Google.
Again, preferences – you know, Dogpile is gonna have the same kind of deal, you can have filtering issues. In a lot of these you’re gonna be encountering issues with filtering.
Online Language Translation Tools
Basically, two different websites do a pretty good job of translating texts and web pages, they’ve been out there for a while. Google has its own (see image). You’re gonna need to say what you are going from: from what language to what language.
And this is pretty clear, except when you get to, for instance, China. We are pretty big in China, and there is a variety of different variants: there is simplified Chinese, this Chinese, that Chinese, you know. How do you know? What you have to do is go through the different Chinese variants if you don’t know exactly what language it’s going to or coming from. And you will actually see a difference between the translations. And these translations, I am warning you now, will be a little off. If it looks like it is something you’re gonna make some conclusion out of, I would then have somebody verify what’s there.
Okay, you set what to translate, and what you are going from and to. There is, if you want it, an option for a specific website. But this does not work as well as it could. It’s more for searching on and translating text down there. And what I actually do is I use something like this together with a language-specific search engine, such as Baidu, which we’ll talk about shortly.
So we have Google, who else is their competition? Yahoo – same type of deal: you can translate Chinese simplified to English. And again, if you have a web page – same type of deal. I’d say more people use Google than Yahoo, but they are both good.
Now the only reason I have this (see image) is, if you are a multinational company, you got to keep in mind – it sounds like a simple point but it’s actually a good point – that there are different databases and different regions that Google and Yahoo, and everybody else doesn’t have. I mean we do a lot of work in Africa – well, they have specific databases for Africa. We do work in the UK, guess what, you would think all the UK stuff is in Google and Yahoo – it’s not. Same thing with China, same thing with Malaysia.
So the point here is to keep in mind you can have the English equivalent depending on what type of investigation you are doing. If I am doing any investigation of China for instance, I know I’ll probably do a search with Baidu. I’ll be looking for something called QQ, and I’ll be looking for something called 123, which are some big things dealing with China. You’re gonna have the same type of deal with whatever region you are in.
If you are a U.S. company, or you are doing U.S. based investigation, this won’t matter to you. But the point here is to remember if you are doing something that deals with a foreign country, or a country outside the U.S., that there might be another database you should be searching. And Baidu is probably the top one I work with when it comes to China: same type of deal, almost looks like Google.
You can submit that web page to one of the two engines I just showed you earlier, and that will give you its best guess of whatever translation. It’s like if I know I am searching on Lance Hawk, a non-Chinese name, first thing I do is translate Lance Hawk to simplified Chinese. Then I’ll patch it in here, then I’ll work it in reverse. Whatever it returns – I’ll reverse it.
Searching blogs – we get to my favorite search engine of all. It’s not Google either. It’s the second one – Spokeo, the strongest one out there, I have been using it for years. I mean it’s the first place I turn to. Well, Technorati also does Myspace and Twitter. Icerocket – again, Myspace and Twitter. So I’ll actually go through Spokeo, Technorati, and Icerocket. And the last three – probably the only one out of those three I used is Bloglines, and it seemed pretty good to me with name searching. Now I will go through each of those individually.
Spokeo is what’s called an aggregator website (see image). What does that mean? It basically combines a bunch of different functions into one thing. Say, I am searching for a name. If you do Lance Hawk, and you’re gonna come up with a ton of Lance Hawks, you’re gonna wanna refine it first. If I knew Lance Hawk was from Pennsylvania, I could do ‘Lance Hawk, PA’, and I’ll restrict the results just to PA. If I knew Lance Hawk was from a specific town in PA, then I could do it even further. But you don’t wanna get too restrictive, just in case you think Lance lives in one town, but what if he lives one town over?
So usually you start a little higher and then fine tune yourself down. If you get results returned, it’ll say, like: “I found 100 results” – well, 100 is actually not bad, but I actually got ones that returned thousands of results, and you can be there all day. So the more information you have about whoever you are looking into upfront, the better off you’ll be.
Now the bad part: it does free general searching, and you will see what it is in a minute. But there is a small cost for that. And this is 3 bucks. That’s about 3 bucks a month. But why I really like it is this: look at the social networks it goes through (see screenshot).
Everybody always says: “Does that mean I can look at my son’s or daughter’s Myspace, or Facebook, or all this other stuff?” It all depends on how they have their settings. Basically, what will be indexed is the non-proprietary stuff, or something where they set a privacy option. And what I actually recommend to any parent, to anyone out here is – look yourself up, look yourself up with your name. Spokeo also has my user ID. And you can do stuff just by user IDs. Some of this stuff is amazing. I am finding more and more things on sites like Pandora and some of these other sites, also on Flickr, there is a lot of stuff. Some are very big sites, and this goes through them all. I like it. But again, trust but verify.
And that gets technical enough, that gives you like the geolocation of a house, and you see a picture of the house. This will pop up the address right away, shows the house and the block. It will talk about the wealth status – like, within 100,000 USD to 120,000 USD. It’s amazing, the best one.
One thing: there have been a couple of court cases claiming that places, especially like Spokeo, are giving out too much information. So they all have to offer an opt-out. So if you do not want to be included on something like Spokeo, basically if you go to Spokeo, under the disclaimer it says: “If you do not want to be included, click here and submit this”. It might be a good idea to do that. And on a lot of these sites there’s been a lot of pressure.
And one other thing, the one I won’t be talking much about today but I think presents a great example is LinkedIn. Previously, places like Google and stuff like that would include some LinkedIn content. But now what they’ve done, and I think they’ve set a bar, is they offer three levels of searching that they now charge for, and they consider their content proprietary. So what’s normally indexed by search engines you cannot reach, and LinkedIn will charge you anywhere up to 50-60 dollars a month for the capability to search. Now they have a 20-30 dollar option, which is almost the same as the 60, there are some differences there though. Don’t forget something like LinkedIn and keep in mind that if you are searching certain sites, they start to get proprietary and do stuff like that.
Okay, let’s now look into Technorati (see image). I tried to find updated numbers but they weren’t published. After I use Spokeo I basically go to this. Same type of deal: this actually looks at tags that authors placed on their websites. That’s exactly what it kinda looks like. What’s interesting is it will always show you like the top 100, and these different things that are going on. I use it more for the actual blog searching.
And we talked about it earlier, I mean, why even worry about blogs, your stuff is not up there. It’s just the brief reviews we do, and we used to do them on an annual basis, then it was semi-annual, then it was quarterly, now monthly. Well, we actually go out, and another thing you can do is put your own domain name in some of these places, and then use wildcards. See where your information is appearing, and I guarantee you will be shocked at what’s out there and what gets out there. We at ‘Air Products’ put in a social policy, but that doesn’t seem to be too successful. But I would highly encourage everybody to do a periodic search on your own domain, just to see what’s out there.
You’ll see stuff out there, I see confidential presentations being put out on some temporary website that was set up, and it has stuff we hold patents on. I guarantee you it’s probably on your domain too. So take your domain, put it through a couple of these – and you might be surprised.
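One way to make that periodic review repeatable is to generate the same set of queries every month and paste them into the search engines you use. The Python sketch below is only an illustration: `domain_exposure_queries` is a hypothetical helper, `example.com` stands in for your own domain, and the file types and marker words are assumptions you would tune. It relies on the `filetype:` and `site:` operators and the minus sign discussed earlier, and different engines may honor them differently.

```python
def domain_exposure_queries(domain,
                            filetypes=("pdf", "ppt", "xls", "doc"),
                            markers=("confidential", "internal use only")):
    """Build queries that look for documents mentioning `domain`
    that are hosted somewhere other than `domain` itself."""
    queries = []
    for filetype in filetypes:
        for marker in markers:
            queries.append(
                f'"{domain}" "{marker}" filetype:{filetype} -site:{domain}'
            )
    return queries


if __name__ == "__main__":
    # Print the list so it can be saved with the review notes.
    for query in domain_exposure_queries("example.com"):
        print(query)
```

Saving the printed list alongside each month's findings gives you a record of what was searched and makes it easy to spot when something new appears.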
IceRocket – I like more because of the Twitter and Myspace stuff, same type of deal (see image). There was a government exercise I participated in, called ‘Cyber Storm’, I just wanted to see if it would pick up some stuff in ‘Cyber Storm’, which is a DHS exercise testing cyber readiness. I was surprised to see what people were talking about from around the world: from Germany, from quite a few other places.
So, another good tool is Bloglines (see image), it does a lot of good stuff with names. BlogPulse – this starts getting into a lot of the graphic stuff. We are going to talk about how graphical some of these places are going. In fact, now they are coming out with what’s called ‘visualizers’, which is a new technology. There is a Facebook visualizer, which is only available to law enforcement, but that’s supposed to be open to the public quite soon. And you can actually look and see, like, here is Lance Hawk, here is Lance Hawk’s friend. It does a visualization profile. ‘NCIS: Los Angeles’ – I don’t know if anybody watches that, but you see where they are grabbing the stuff and bringing it down. It’s the same type of concept.
Okay now, searching Wikis – probably the nicest one I like is Qwika when it comes to that. It doesn’t seem to get as much of the press as the others, but I like that a lot more. It is much more simple, it seems to return more. Again, it’s just very simple, just like with a blog, it’s like with Google. Say, I am searching for ‘Surfynol 104’ – that product I talked about. I search the Wikis, and we were shocked at some of the information we found. You know, people are escaping from blogs and maybe going to Wikis, or vice versa. It would be nice if I could give you the super search engine, where you just go to one place, but that does not exist. Especially see what LinkedIn is doing and how they are carving out space. And that’s probably gonna continue, which means there are going to be more and more specialized searching and search engines.
Now, searching Tweets. We have Twitter itself, Twittorati and Monitter. Monitter is pretty slick. You’ll see how different that is, if you haven’t already.
Recently they’ve added a new capability to Twitter. You can actually see mini profiles now, I use this capability all the time. It’s like a summary, think of that as a summary: it has bio, recent tweets. What’s really neat too is a lot of people start to embed photos and videos in Twitter. And you actually get to see a photo, video embedded – that type of deal. And that’s just recently added.
I do not use Twitter too much. I just occasionally put stuff out there, just to keep an account going in case I ever need it to do something. You know, in this case you see some of the tweets that I had. I’m actually quite restrictive on followers. Twitter itself is a good search engine, just like eBay itself is a good search engine, and we will talk later about eBay. I am shocked at the amount of information I find even on eBay.
Twittorati is almost like little Twitter. I find stuff on that which I don’t find on Twitter, and I can’t explain why.
Now, where Monitter is unique and good is you can actually use multiple, up to three keywords, at one time while conducting your search (see image). And you’ve got to be very careful with this because if it is something that’s active, your screen is gonna be flying. And you talk about network bandwidth, and if you have a place which is bandwidth sensitive, you wanna be careful there. Let’s say ‘Air Products’ and maybe some other competitor – where we basically use this is for intellectual assets protection searching.
Searching for People
Now, searching for people. Most of these are free. What you can see here is the technology has been changing, where when you search for people you used to have email right away; now what they are starting to do is obfuscating by putting something like asterisks in some of the fields – like, you might see H*** for my last name. You’ll always see the domain though, but you’ll never see the full name anymore. And now what they want you to do is contact such and such. What I end up doing is I take that H and then the asterisks, and I will do a wildcard with that and I go back to Google to start searching with that. So you are kind of reverse engineering stuff.
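The masking trick above can be sketched in a few lines of Python. This is only an illustration, not a tool from the talk: the function names `mask_to_regex` and `filter_candidates` are hypothetical, and it assumes that a run of asterisks stands for one or more hidden characters (sites do not always print one asterisk per hidden letter). It filters candidate strings you have already collected from ordinary search results against the masked value.

```python
import re


def mask_to_regex(masked):
    """Turn a masked field such as 'H***' into a compiled regex.

    Assumption: each run of asterisks hides one or more non-space characters.
    """
    parts = re.split(r"\*+", masked)          # literal pieces around the mask
    pattern = r"\S+".join(re.escape(p) for p in parts)
    return re.compile("^" + pattern + "$", re.IGNORECASE)


def filter_candidates(masked, candidates):
    """Keep only the candidates that are consistent with the masked value."""
    rx = mask_to_regex(masked)
    return [c for c in candidates if rx.match(c)]


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(filter_candidates("H***", ["Hawk", "Smith", "hill"]))
    print(filter_candidates("h***@example.com",
                            ["hsmith@example.com", "jdoe@example.com"]))
```

The same idea works for masked email addresses, where the domain is usually shown in full and only the local part is hidden.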
Probably out of all these, my favorite one is Pipl. It’s not restricted to the U.S. A lot of these are just U.S. based. We had a recent investigation in the Netherlands, we were tracking stuff in the Netherlands. I found exactly what I wanted on Pipl. I had started out the investigation that said, you know: “This person has this ID”. And I searched on the ID. I had a specific ID, so I started with Google. And that got hits on eBay ID. Now I looked in the eBay ID, I did a history search on that. I saw other IDs that the user uses, which eBay will show you. Then I did searches on his other IDs, and that’s when I found information and it actually brought the cases that we wanted. So it’s a whole process.
There is another one, ZabaSearch (see image). If you do a phone number lookup, you actually default out to another search, and it says: “We found your information, for $29.95 we’ll return your information”. I’ll save you $29.95, they don’t return your information. Well, I actually didn’t say that… They might return your information, but a lot of times I tried they did not return my information that I wanted. Phone number lookup defaults to Intelius, and it’s a premium search.
But this by itself is not a premium search. It is restricted to the U.S. If you are sure what state the person is in, again, do that because it’s gonna return a heck of a lot of hits that you might have to cull through.
ZoomInfo is a database specifically for business people (see image). It’s almost like a subset of LinkedIn. This will tell you more. For instance I did a search on myself. Again, I didn’t restrict that to state. And I got that I am a member of Nevada Area Council Inc., but I am not a member. Obviously that was not me. Obviously the second one is not me. Hey, the third one is me. And if you see the checks beside email and phone, I did follow out just to see if it would disclose it. And it went to another service, which says: “For $29.95 we will return this”. You know, I didn’t pay the $29.95. So this is more for business.
Here is my favorite one – Pipl. This is the biggest thing out there, providing reverse lookup capability, email, username. I mean, you can be dealing with some piece of evidence. A lot of times we might start off with an email from, say, email@example.com. I wanna know who hawklp is. Well, we have an email address, and we have a username. So there are two ways to start the search on there. And then, like I said, you might go out to Google and submit that. And that might say: “He is an eBay user”. So you go to eBay. So you work all this stuff together, but that is my favorite one. It is free and it is good.
Intelius seems to be like a number one thing that defaults out people search. And Social Security Number – that’s an interesting one. The Social Security, what’s called the death index, is actually publicly searchable, which blew my mind. Less than a year ago, I thought, I never met my father, so a good test would be to try and track him down. So I went through all these things. I found one place he was on. And you will never guess the place he was on – Ancestry.com. It really blew my mind. I am thinking of myself like a Mister Pro, I am using this advanced thing and that advanced thing – and it pops on this ancestry.com. You will be surprised – it listed his Social Security Number.
Searching for Email
Now, searching for email. As I said before, right now there is no good email finder by itself. We’ve talked about a lot of products like Spokeo and stuff like that that you can do the reverse search and all that with. A lot of them are masking now. So sometimes you might get a message saying: “For $29.95 we will tell you”. You can try that, but before you try that I am telling you – 50% of the time you’re gonna have success. Just take the masked value they give you, and then do a regular Google search, and you’ll probably get it. But if you don’t, you know this is the type of product you can use.
Email Finder hosted at emailfinder.com is probably the only email finder I do subscribe to, besides Spokeo. And it actually is what it says (see image). I actually use it for some other functions, specifically blacklist checker. It’s one of the only ones I know that checks against the top blacklists out there. Because I wanna know that maybe I am not getting information because this guy is on some blacklist or something like that, so I always check that. EmailValidator is pretty good too. It will actually identify whether or not it’s a valid email address.
123people will show business emails but not personal. You know, you always get returned stuff like this, you get H and then you see the P, stuff like that. Well, you can search on that. And if I use a different search engine, I’ll actually get the airproducts.com but I want to get a piece upfront. And you start to put it together, and then you cut and paste, and then you do a search for that in Google, and a lot of times you’ll end up getting it.
Reverse Phone Lookups
Now, other old reverse lookups. Probably the best one up here which I like, which I’ve given to my kids and I’ve used a million times is Whocalledus. Everybody gets those annoyance phone calls or phone calls on credit collections; you know, it says: “Number unavailable” and stuff like that. That’s sweet, plug it into that. And it is free, so you know who it is for. There are old classics here – Whitepages and Anywho – but like I said, I like Whocalledus.
PeopleLookup – 5 dollars per use – I’ve tried that several times and I’ve been unsuccessful on that. But I have achieved success on Reverse Phone Check.
Other investigative sites (see image) – I talked about the concept of visualizers, which is something new. It’s actually out now for some, but not for YouTube.
SearchTempest – does CraigsList and eBay, and Amazon. It has Amazon, fantastic website. And it is one of the few that do it worldwide. You can actually restrict it to the region. When I talked about the one investigation when I plugged it in eBay before, I was misleading a bit – I plugged it into this first, and then it actually gave me an eBay lead.
If you have access to your places, you wanna check for any activity on backpage.com. That seems to be the wild site right now that people are dumping all kinds of crap. It used to be CraigsList that people said was going downhill; well, the downhill one right now is actually backpage.com, so you wanna watch that too.
We’re gonna go over pacer.gov – that’s slick, I’ll show you how you can get free stuff from the government; prbpub.com, which is more state, links to PeopleSmart; and ssnvalidator.com.
Okay, Lococitato is a Facebook visualizer, LEO (Law Enforcement Officer) only (see image to the left). It is supposed to be out for the public any time now. There are Myspace and YouTube visualizers too. I’m only bringing that up just for people to see new technologies coming out, new capability. It’s a visual type capability, different than what we have.
Now, everybody wants to know about tracing IP addresses. The biggest change here in the last year or two has been geolocation, and the capability to actually track by IP addresses. Use the first two depending on whether you are searching the U.S. or overseas. Here is the collection of tools, they are great sites to look at. But probably one of the newer ones is whatismyipaddress.com/ip-lookup. When you get home and you are identifying your IP address, plug it in there – and it’s amazing, and it’s free. You plug in your IP address at home, and you can see the map coming up – fantastic to track back.
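One practical note before plugging an address into any lookup site: the address your home computer reports locally is often a private one (for example 192.168.x.x), which no geolocation service can place on a map. A minimal sketch of a pre-check, using only Python's standard `ipaddress` module; the function name `is_publicly_routable` is made up for this illustration and is not something from the talk or from any of the sites mentioned.

```python
import ipaddress


def is_publicly_routable(text):
    """Return True if text parses as an IP address that is globally routable.

    Private, loopback and other reserved ranges return False, as does
    anything that is not a valid IPv4 or IPv6 address.
    """
    try:
        addr = ipaddress.ip_address(text.strip())
    except ValueError:
        return False
    return addr.is_global


if __name__ == "__main__":
    for candidate in ["192.168.1.10", "8.8.8.8", "not-an-ip"]:
        print(candidate, is_publicly_routable(candidate))
```

Only addresses that pass this check are worth submitting to a geolocation lookup.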
What’s considered the forensics god is dnsstuff.com. That actually does quadruple geolocation, where you have four satellites coming in, just so that you are sure if you have to go to court.
Now, pacer.gov – you always want to know, a lot of times when you do an investigation, in regards to litigation, if there has been any prior litigation from a federal or state perspective. What Pacer does is it includes federal information. Not only does it include federal information, but it gets it out there, there are government standards that dictate how fast it gets it out there. The bad part for an investigator is that there is some information that is redacted. Obviously, Social Security Numbers are among that, of course. There is other information like addresses and stuff like that.
But if we go to it, and this is what the website looks like (see image), I put in my query – if I am investigating, like, Surfynol 104 that I mentioned before, and there is somebody who is basically misappropriating some of our information. And I knew that was, say, ACME Tech. So I looked up on this ACME Tech and I saw some litigation they were involved with, and it was 60 pages or something like that. Access to the court documents is 8 cents per page. You are capped at $2.40, but the best thing is this one here: the billing is quarterly, but if you don’t go over 10 bucks – it’s for free. So if you are only doing it once or twice, you can get some nice court related information there.
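The fee arithmetic above can be worked through in a short sketch. The figures are the ones quoted in the talk (8 cents per page, a $2.40 cap, a $10 quarterly threshold) and may well have changed since; that the cap applies per document is an assumption here, and the function names are hypothetical. Amounts are kept in whole cents to avoid rounding surprises.

```python
# Figures as quoted in the talk; check the current fee schedule before relying on them.
PER_PAGE_CENTS = 8
DOC_CAP_CENTS = 240            # assumption: the cap applies to each document
QUARTERLY_WAIVER_CENTS = 1000  # bills at or under this amount are waived


def document_fee_cents(pages):
    """Fee for one document: per-page rate, limited by the cap."""
    return min(pages * PER_PAGE_CENTS, DOC_CAP_CENTS)


def quarterly_bill_cents(page_counts):
    """Total owed for a quarter, or 0 if the total does not exceed the waiver."""
    total = sum(document_fee_cents(p) for p in page_counts)
    return 0 if total <= QUARTERLY_WAIVER_CENTS else total


if __name__ == "__main__":
    print(document_fee_cents(60))          # the 60-page filing from the example
    print(quarterly_bill_cents([60, 10]))  # a light quarter
    print(quarterly_bill_cents([60] * 5))  # a heavier quarter
```

With these numbers the 60-page filing costs 240 cents rather than 480, and a quarter with only that filing and one 10-page document totals 320 cents, which falls under the waiver.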
Another slick one is SSN Validator (see image). We all come across Social Security Numbers. The big thing I like about this is that there is at least a check: number one – it will say whether it’s been issued, number two – it will say whether the person is deceased or not. But what I don’t like is a lot of things, like if you do credit reports it will redact your Social Security Number. So if you ever go and put it to any output or something like that, you really have to be cognizant of the information you are working with and know that it can get out there.
Determining risk to reputation – I never thought I would be spending time on finance message boards, but I am now. It is amazing, I don’t know if people are aware but if you are a publicly traded company (‘Air Products’ is a publicly traded company, our stock symbol is APD), I go out to finance message boards, and the stuff I see is unbelievable. And it is stuff you got to react to. One time for instance, there was something that appeared that said: “I stole 600,000 dollars from ‘Air Products’. I cracked into their recipe system”. Well, that’s gone out to the public, right? Yes, we have to pay attention to that, maybe address that, and do something like that. So that is something, like I said, where on a quarterly basis you wanna set up something to look into stuff like that.
Let’s talk about two other places. There is one called jobvent.com. I can’t believe this website is up. I mean this is just, in my humble opinion, a bunch of ticked off people just job venting. But it’s amazing just from an intelligence perspective what you get. People say a lot of bad things when they are teed off. But Jobvent is one of the two.
The other one is glassdoor.com. At least an opposing viewpoint can be presented here. So this is a little bit better.
Here is actually a newer one – trackle.com. And what I like about this is, remember I talked about brand monitoring and keeping an eye on your competitors and stuff like that, so this is making its reputation just on that.
Now, let’s assume all of those fail – I keep on going back to that sample product Surfynol 104. Say, I didn’t find anything in Cache, and somebody has brought the website down. This is an oldie, but it is a goodie – archive.org (see image). I am telling you, at least 30-50 times I’ve solved cases just using this one website, you know, what’s called the Wayback Machine. What you end up doing is, right there where it says “Take Me Back”, you put the URL: www.airproducts.com, and this goes all the way across. So you see the different times it’s been archived. So if somebody said: “You know what, I know that graphic was up there on June the 20th, 2004”, I can actually go back to the June 19th backup, pull that out, and there it is. Then we can capture and preserve our evidence.
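For anyone who prefers to script the date lookup rather than click through the calendar, the Wayback Machine also accepts dated URLs. A minimal sketch, assuming the publicly documented availability endpoint at archive.org/wayback/available and the web.archive.org/web/<timestamp>/<url> address pattern; the helper names are hypothetical, and the code only builds the URLs, it does not fetch anything.

```python
from urllib.parse import urlencode


def availability_query(url, date_yyyymmdd):
    """Build a query asking which archived capture is closest to the given date."""
    params = urlencode({"url": url, "timestamp": date_yyyymmdd})
    return "https://archive.org/wayback/available?" + params


def snapshot_url(url, date_yyyymmdd):
    """Build a dated address; the archive redirects to the nearest capture it holds."""
    return "https://web.archive.org/web/{}/{}".format(date_yyyymmdd, url)


if __name__ == "__main__":
    # Date taken from the example above (June 20, 2004).
    print(availability_query("www.airproducts.com", "20040620"))
    print(snapshot_url("www.airproducts.com", "20040620"))
```

Opening either address in a browser shows whether a capture exists near that date; what comes back still needs to be captured and preserved like any other evidence.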
Now, if you forgot everything I said, there is an all-in-one that contains links to everything going in and out, all kind of stuff. That’s pandia.com. Actually Yahoo has one, and there is a government agency that has one, but Pandia seems to have a lot more. And this is just what is referred to as a Power Search Engine, which is totally different. This just refers you out to other search engines.