Internet security
<blockquote data-quote="1krr" data-source="post: 2613914" data-attributes="member: 750"><p>The best way to keep your information from being abused is to know how it is collected and analysed. You might hear lots of things about "big data," and the Snowden revelations confirmed what most in the industry already knew was going on. In basic terms, think of your information like a mosaic, perhaps one of those mosaics made up of thousands of pictures where, when you step back, you see the "big picture." </p><p></p><p>That is a great metaphor for all the little tidbits of information you leave around the internet. This post, for example, doesn't really tell you much about me. My username is 1krr, and with some research you might find that I own, used to own, or have an interest in motorcycles, most specifically a CBR1000RR sport bike. Now the PIs and general sleuths in the room will get that "aha" moment and start to think that is interesting. But there are also facts like where I'm posting: an Oklahoma-centric firearms forum, specifically the "preppers" subforum, which probably tells you a great deal about me. Of course none of this is intended to be racially or culturally insensitive, just raw analysis. It would seem to suggest better than 50% odds that I'm white and 70% that I'm male. I like motorcycles, guns, and preparedness. I've been registered for a long time but post rarely over that span, which might suggest I tend to be reclusive or uncomfortable with people I don't know. But the social nature of many of the above hobbies would suggest that while I do not seek out attention in big crowds, I might be deeply involved in smaller, tight-knit communities surrounding car shows, shooting ranges, etc.</p><p></p><p>In any case, the above is an overly simplified way of using some seemingly benign information to identify me for x purpose. But this is good ol' detective work and is not data science. As a data scientist, I take a different approach. 
I would pick out all the bits of data that uniquely identify everyone and associate them with each other. I wouldn't read 1krr's posts on okshooters; I would collect every post by anyone on okshooters and store it. Then I would ignore the context of the words but read the words themselves. Misspellings are a big one. Who misspelled the same word in the same way? Punctuation and capitalization? Do they use a lot of commas, and where? Do they tend to use abbreviated words or compounded words? Type your post, then turn all the letters white so you can't see them, and what is left? A pattern of symbols. Once I have patterns, I look to see how those patterns associate with other patterns and what the outcomes typically are. With a few iterations of this, whoever did the analysis can tell fairly confidently who you are, what you like, how you act in random situations, and with that, how to manipulate you into doing something they want you to do. Thankfully, for the most part, the end result of all this is some very successful advertising that makes companies A LOT of money. </p><p></p><p>But it can be more sinister. If you are familiar with binary numbering (the same as a computer uses), it will make sense when I say it only takes 33 bits of information to uniquely identify anyone in the world, anywhere. Binary is just a simple two-digit system, 0 and 1. Each bit represents some piece of information. Counting to ten: 0 = 0, 1 = 1, 10 = 2, 11 = 3, 100 = 4, 101 = 5, 110 = 6, 111 = 7, 1000 = 8, 1001 = 9, and 1010 = 10. Since we only have two digits, basically on and off, you see how it took four bits to count to ten (four bits cover 16 values in all). Each additional bit doubles the number of values you can represent, so 33 bits gives 2^33, about 8.6 billion, which is more than the number of people on Earth. So what does this mean? 
It means that by collecting and compiling huge volumes of little tidbits of info, I can identify exactly who you are, and from the context of those bits, I can tell you what you like, where you go, with whom, and when. By sneaking in some statistics, I could compel you, within reason, to act the way I want you to (i.e. to go buy that damn Coach purse, you know you want it, you've said so on every social media site you use, which I've correlated to you using all the tidbits you've left around, you wonderful little creature of habit, you!). Hell, I can tell a lot about you by what you never say (you don't post your location, birth date, etc.). </p><p></p><p>This is why Facebook, Amazon, Google, and many others spend billions on technology research and infrastructure: big data gives the right people so much good knowledge about your life that a slight nudge, in the form of a well-timed and properly colored ad, gets the sale 80% of the time. Businesses use the way you interact with their websites to profile/pattern you and generate the content most relevant to your willingness to separate your money from your wallet in exchange for some high-markup widget they are selling. People use this information in all kinds of ways. Security guys use patterns to identify or predict the type of attack they may face. </p><p></p><p>Fun read about the basics here:</p><p><a href="http://blogs.wsj.com/digits/2010/08/04/the-information-that-is-needed-to-identify-you-33-bits/" target="_blank">http://blogs.wsj.com/digits/2010/08/04/the-information-that-is-needed-to-identify-you-33-bits/</a></p><p></p><p>Now what do you do about it? Because the data seems insignificant, people have no problem happily typing it into Facebook, Pinterest, Gmail (hosted email is a BIG one), etc. And because it is seemingly insignificant, it's generally fairly honest. Big data relies on being able to pull together relatively factual and accurate information. 
Since it's basically collecting and storing huge volumes of data to search out and correlate patterns, the first step to anonymity is to change your patterns. Use periods where commas are supposed to go. Don't respond to the intrusion into your PII (personally identifiable information) by skipping those fields; fill them out! If you are a 23-year-old Asian female, become a 57-year-old white male. Accurately filling that information out is a gold mine for big data algorithms. One of the major techniques is called MapReduce, and it's just a way of sucking up huge volumes of data and building a table of contents or an index that tells you quickly where all the instances of some bit you are looking for are located. And you can compound those searches to generate patterns and see what you can learn about your subject. Change some of that data and all of a sudden you are fitting into patterns that don't make sense. Change your profile. My Facebook account thinks I'm a 21-year-old student at OU. I may be, but it doesn't seem like it compared to most of my open-book friends, and it doesn't change how I use the system. It does change the advertising I see, if I see any at all. If you use Amazon and find it a little creepy that they know so much about what you are going to buy, buy random things. If you have no kids, buy a box of diapers for a friend who has kids. Order tools and toenail polish (makes good improvised machinist's blue). Be contradictory in what you put in. If you read this post closely, you will see lots of commas and abbreviations mismatched with some misspelled words (I really am a lazy typer, so that's not uncommon). </p><p></p><p>Other things: use multiple browsers. Use Firefox instead of Chrome sometimes, since sites log your user agent (basically a fingerprint of the browser and the computer it is running on). Firefox has a user-agent switcher add-on that you can use to make it look like any popular browser out there. 
You can use the Tor network, but the only people who do are typically either honest-to-goodness bad guys or people trying to bypass their work URL filter to watch porn or get to okshooters! (There are easier ways to do this.) In any case, I've rambled enough, but ask questions; I'm happy to talk all about big data analytics because it's a realm I'm working in and do very much enjoy (but I don't do the advertising side of things).</p></blockquote><p></p>
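<p>For anyone who wants to check the bit arithmetic in the quoted post, here is a minimal Python sketch. The population figure (8 billion) and the function names are my own assumptions for illustration; none of them come from the post. It shows that 33 bits are enough to give every person a distinct number, and roughly how much a single fact narrows things down: a fact true of half of all people is worth 1 bit, and one true of 1 person in 1,024 is worth 10 bits.</p>

```python
import math

# Assumption for illustration: a round figure for world population.
WORLD_POPULATION = 8_000_000_000


def bits_needed(population: int) -> int:
    """Smallest number of bits b such that 2**b >= population."""
    if population < 1:
        raise ValueError("population must be at least 1")
    # Labels run from 0 to population - 1, so size the largest label.
    return (population - 1).bit_length()


def values_for_bits(bits: int) -> int:
    """How many distinct values a given number of bits can represent."""
    return 2 ** bits


def bits_from_fact(fraction: float) -> float:
    """Information, in bits, from a fact true of `fraction` of people."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return -math.log2(fraction)


if __name__ == "__main__":
    print("bits to count to ten:", bits_needed(10))
    print("bits for everyone on Earth:", bits_needed(WORLD_POPULATION))
    print("values in 33 bits:", values_for_bits(33))
    print("bits from a 50/50 fact:", bits_from_fact(0.5))
    print("bits from a 1-in-1024 fact:", bits_from_fact(1 / 1024))
```

<p>The point of <code>bits_from_fact</code> is that small, boring details add up: a handful of facts that each cut the candidate pool by a large factor gets to 33 bits quickly. Treat it as a back-of-the-envelope model, since real facts overlap and are not independent.</p>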