Bitcoin mining ubuntu 11.04

ubuntu for bitcoin mining

Currently, 11.04 is bitcoin what is done — mining pieces of the transaction are omitted. We've already discussed the following example. Im running a miner on ubuntu laptop with an ATI mobile. The outlet provides 30 amps, but one miner needs about 7. Thank you so much for the quick reply. Lenny January 11, 1:


avalon2 bitcoin 105g

Lease a space for operations Obtain a Certificate of Occupancy pro tip — make changes to space after getting CO Get account with power company and internet service provider Bitcoin Mining Operations Get electrical pulled into space Set up natural circulation cooling Configure data network Configure power distribution to Bitcoin miners Operate miners and scale up Keep track of all purchases and expenses for tax purposes. It might make it a little more difficult for DDOS attackers to overload the server with getwork-requests. Ken November 21, 7: Travel eats up costs in a hurry. The crawler thread simply iterated over the keys in this dictionary, looking for the next domain it was polite to crawl.


verifying block database integrity bitcoin values

Thank you so much for all of your mining experience and success. In particular, mining we define area in this example, what set! Michel November 21, 5: I have a Xiaomi phone so that would be a 11.04 match for their smart ubuntu. What space are you looking for? How many hours does the computer have to stay on to earn 1 Bitcoin? At first bitcoin double spending seems difficult for Alice to pull off.



Make sure it accepts PCIe 2. Hero Member Offline Activity: If you're in the U. Full Member Offline Activity: Buying hardware and power for mining is not going to be profitable.

Best way is to use a machine you already have and power you don't pay for. If I'm actually wrong and there is a way to turn a profit with purpose-bought hardware, I'd love to know about it.

After that it's pure profit. Most people assume difficulty will continue to rise, but I think it will level off soon, because except for unusual deals like this one, GPU mining is close to becoming unprofitable. I'm running a miner on my laptop with an ATI mobile. Would be good if they put them on PCIe boards for a desktop. If looking at the long term, as long as the difficulty rate doesn't change too quickly, it may be worth the investment.

Matter of fact I nearly got that laptop for going back to the spring semester at university, but built my current desktop instead for an equal price. The desktop turned out to be multiples more powerful in other ways, just apparently not at bitcoin mining. I'm kind of on the fence about whether the increased investment Which isn't so bad when you take into account the fact that a laptop is its own entire system, not "just" one super expensive component.

It may in the long run, but I suppose mileage may vary. Alternatively, a gtxm laptop: Note that the listed power draw is W for the ENTIRE laptop, while the gtxm is said to be almost identical to a desktop gtx in performance.

Akilae on January 05, , January 06, , Sorry, can't tell for sure. Also if you plan to get more than 2 in the 4-way mobo then the liquid cooling upgrade is mandatory. Cooling I suggest you get the Koolance block for the Refer to this review please: Case Make sure you have at least 8 slots in the back for the GFX. I also run some other GPUs on Ubuntu Maybe it's a problem with the final version of Catalyst Odd that it works with Phoenix and not with BitMinter, though.

Full Member Offline Activity: I love this program I will be adding some video cards soon when I get a deal on some 's Thanks for joining, ManOfKnight. We are growing steadily and hopefully payouts will be more regular soon.

Some updates for the software today: Much requested features - sorry it took so long! Upgraded to latest version of Steel Series.

The power meter tick marks looked weird for some CPU models in the old version - this looks better now. Some users got a blank login window with nowhere to enter name and password. I'm not sure what causes this, but at least now the login window should be usable. This has been fixed. You will get the new version the next time you restart the miner. Quick fix on the pool backend a few minutes ago. It would sometimes send out work that the new miner version did not like, which would result in miscalculations "bombs" in the GUI.

This is now fixed. Sorry if this caused you to think you had a hardware problem. Yes, the problem was only there for a few hours and produced just a few bombs. Is the GPU overclocked? August 26, , We will have reinvented Bitcoin! This strategy is slower than if I explained the entire Bitcoin protocol in one shot. But while you can understand the mechanics of Bitcoin through such a one-shot explanation, it would be difficult to understand why Bitcoin is designed the way it is.

The advantage of the slower iterative explanation is that it gives us a much sharper understanding of each element of Bitcoin. You may find these interesting, but you can also skip them entirely without losing track of the main text. On the face of it, a digital currency sounds impossible. If Alice can use a string of bits as money, how can we prevent her from using the same bit string over and over, thus minting an infinite supply of money? Or, if we can somehow solve that problem, how can we prevent someone else forging such a string of bits, and using that to steal from Alice?

These are just two of the many problems that must be overcome in order to use information as money. Suppose Alice wants to give another person, Bob, an infocoin. To do this, Alice writes down a message saying that she is giving Bob one infocoin. She then digitally signs the message using a private cryptographic key, and announces the signed string of bits to the entire world. A similar usage is common, though not universal, in the Bitcoin world. But it does have some virtues.

So the protocol establishes that Alice truly intends to give Bob one infocoin. The same fact — no-one else could compose such a signed message — also gives Alice some limited protection from forgery.

To make this explicit: Later protocols will be similar, in that all our forms of digital money will be just more and more elaborate messages [1]. A problem with the first version of Infocoin is that Alice could keep sending Bob the same signed message over and over. Does that mean Alice sent Bob ten different infocoins? Was her message accidentally duplicated?

Perhaps she was trying to trick Bob into believing that she had given him ten different infocoins, when the message only proves to the world that she intends to transfer one infocoin. They need a label or serial number. To make this scheme work we need a trusted source of serial numbers for the infocoins.

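One way to create such a source is to introduce a bank. This bank would provide serial numbers for infocoins, keep track of who has which infocoins, and verify that transactions really are legitimate. Bob does not simply accept a signed message from Alice. Instead, he contacts the bank, and verifies that the infocoin with that serial number belongs to Alice and that Alice has not already spent it. This last solution looks pretty promising.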

However, it turns out that we can do something much more ambitious. We can eliminate the bank entirely from the protocol. This changes the nature of the currency considerably. It means that there is no longer any single organization in charge of the currency. The idea is to make it so everyone collectively is the bank. You can think of this as a shared public ledger showing all Infocoin transactions. Now, suppose Alice wants to transfer an infocoin to Bob.

A more challenging problem is that this protocol allows Alice to cheat by double spending her infocoin. And so they will both accept the transaction, and also broadcast their acceptance of the transaction. How should other people update their block chains? There may be no easy way to achieve a consistent shared ledger of transactions.

And even if everyone can agree on a consistent way to update their block chains, there is still the problem that either Bob or Charlie will be cheated. At first glance double spending seems difficult for Alice to pull off.

After all, if Alice sends the message first to Bob, then Bob can verify the message, and tell everyone else in the network including Charlie to update their block chain. Once that has happened, Charlie would no longer be fooled by Alice. So there is most likely only a brief period of time in which Alice can double spend.

Worse, there are techniques Alice could use to make that period longer. She could, for example, use network traffic analysis to find times when Bob and Charlie are likely to have a lot of latency in communication. Or perhaps she could do something to deliberately disrupt their communications. If she can slow communication even a little that makes her task of double spending much easier.

How can we address the problem of double spending? Rather, he should broadcast the possible transaction to the entire network of Infocoin users, and ask them to help determine whether the transaction is legitimate.

If they collectively decide that the transaction is okay, then Bob can accept the infocoin, and everyone will update their block chain. Also as before, Bob does a sanity check, using his copy of the block chain to check that, indeed, the coin currently belongs to Alice.

But at that point the protocol is modified. Other members of the network check to see whether Alice owns that infocoin. This protocol has many imprecise elements at present. Fixing that problem will at the same time have the pleasant side effect of making the ideas above much more precise.

Suppose Alice wants to double spend in the network-based protocol I just described. She could do this by taking over the Infocoin network. As before, she tries to double spend the same infocoin with both Bob and Charlie. The idea used to prevent this, known as proof-of-work, is counterintuitive and involves a combination of two ideas: (1) to make it computationally costly for network users to validate transactions; and (2) to reward them for trying to help validate transactions. The benefit of making it costly to validate transactions is that validation can no longer be influenced by the number of network identities someone controls, but only by the total computational power they can bring to bear on validation.

But to really understand proof-of-work, we need to go through the details. For instance, another network user named David might have a queue of pending transactions. David checks his copy of the block chain, and can see that each transaction is valid.

He would like to help out by broadcasting news of that validity to the entire network. However, before doing that, as part of the validation protocol David is required to solve a hard computational puzzle, the proof-of-work. What puzzle does David need to solve? Let $h$ be a fixed hash function known by everyone in the network. Bitcoin uses the well-known SHA-256 hash function, but any cryptographically secure hash function will do. Give David's queue of pending transactions a label, $l$. Suppose David appends a number $x$ (called the nonce) to $l$ and hashes the combination.

The puzzle David has to solve, the proof-of-work, is to find a nonce $x$ such that when we append $x$ to $l$ and hash the combination the output hash begins with a long run of zeroes. The puzzle can be made more or less difficult by varying the number of zeroes required to solve the puzzle.

A relatively simple proof-of-work puzzle might require just three or four zeroes at the start of the hash, while a more difficult proof-of-work puzzle might require a much longer run of zeroes, say 15 consecutive zeroes. We can keep trying different values for the nonce until we find one that gives us a string of four zeroes at the beginning of the output of the hash. This will be enough to solve a simple proof-of-work puzzle, but not enough to solve a more difficult proof-of-work puzzle.
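The nonce search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the label "Hello, world!" stands in for a queue of pending transactions, the nonce is appended as decimal text, and "zeroes" means leading hexadecimal digits of the SHA-256 output. The function name `proof_of_work` is hypothetical.

```python
import hashlib


def proof_of_work(label, zeroes):
    """Try nonces 0, 1, 2, ... until sha256(label + nonce) starts with
    `zeroes` hexadecimal zero digits. Returns (nonce, hex digest)."""
    prefix = "0" * zeroes
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{label}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Illustrative label (an assumption, not real transaction data).
nonce, digest = proof_of_work("Hello, world!", 4)
print(nonce, digest)
```

Each extra required zero multiplies the expected number of attempts by 16, which is why the puzzle's difficulty is so easy to tune.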

What makes this puzzle hard to solve is the fact that the output from a cryptographic hash function behaves like a random number. So if we want the output hash value to begin with 10 hexadecimal zeroes, say, then David will need, on average, to try about $16^{10} \approx 10^{12}$ different values for the nonce before he finds a suitable one. In fact, the Bitcoin protocol gets quite a fine level of control over the difficulty of the puzzle, by using a slight variation on the proof-of-work puzzle described above.

Instead of requiring leading zeroes, the Bitcoin puzzle requires the hash of a block's header to be lower than or equal to a number known as the target. This target is automatically adjusted to ensure that a Bitcoin block takes, on average, about ten minutes to validate. In practice there is a sizeable randomness in how long it takes to validate a block: sometimes a new block is validated in just a minute or two, other times it may take 20 minutes or even longer.
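A target-based puzzle of this kind can be sketched as follows. This is a simplification, not Bitcoin's actual block format: the header bytes are made up, the nonce is an 8-byte suffix, and a single SHA-256 is used. The names `meets_target` and `mine` are hypothetical.

```python
import hashlib


def meets_target(header, nonce, target):
    """True if the hash of header+nonce, read as an integer, is <= target."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") <= target


def mine(header, target):
    """Return the smallest nonce that satisfies the target."""
    nonce = 0
    while not meets_target(header, nonce, target):
        nonce += 1
    return nonce


# With this target roughly 1 hash in 2**12 = 4096 succeeds.
target = 2 ** (256 - 12)
nonce = mine(b"example block header", target)
print(nonce)
```

Because the target can be any integer, the difficulty can be tuned in much finer steps than "one more leading zero", which is what makes a steady average block time achievable.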

Instead of solving a single puzzle, we can require that multiple puzzles be solved; with some careful design it is possible to considerably reduce the variance in the time to validate a block of transactions.

Other participants in the Infocoin network can verify that David's nonce is a valid solution to the proof-of-work puzzle. And they then update their block chains to include the new block of transactions.

For the proof-of-work idea to have any chance of succeeding, network users need an incentive to help validate transactions. The solution to this problem is to reward people who help validate transactions. In particular, suppose we reward whoever successfully validates a block of transactions by crediting them with some infocoins. Provided the infocoin reward is large enough that will give them an incentive to participate in validation.

In the Bitcoin protocol, this validation process is called mining. For each block of transactions validated, the successful miner receives a bitcoin reward.

Initially, this was set to be a 50 bitcoin reward. But for every 210,000 validated blocks (roughly, once every four years) the reward halves.

This has happened just once, to date, and so the current reward for mining a block is 25 bitcoins. This halving in the rate will continue every four years until the year 2140 CE. At that point, the reward for mining will drop below $10^{-8}$ bitcoins per block, which is the smallest unit of Bitcoin.

So in 2140 CE the total supply of bitcoins will cease to increase. Bitcoin also makes it possible to set aside some currency in a transaction as a transaction fee, which goes to the miner who helps validate it.
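The halving schedule can be worked through numerically. The sketch below assumes the commonly cited parameters (a 50 bitcoin initial reward, a halving every 210,000 blocks, and amounts counted in units of $10^{-8}$ bitcoin) and models halving as an integer right shift. The function names are hypothetical.

```python
HALVING_INTERVAL = 210_000           # blocks between halvings (assumed)
UNITS_PER_BITCOIN = 100_000_000      # smallest unit is 1e-8 bitcoin
INITIAL_REWARD = 50 * UNITS_PER_BITCOIN


def block_reward(height):
    """Reward, in smallest units, for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings


def total_supply():
    """Sum the rewards over every halving period until the reward is zero."""
    total = 0
    halvings = 0
    while (INITIAL_REWARD >> halvings) > 0:
        total += (INITIAL_REWARD >> halvings) * HALVING_INTERVAL
        halvings += 1
    return total


print(block_reward(0) / UNITS_PER_BITCOIN)        # first period
print(block_reward(210_000) / UNITS_PER_BITCOIN)  # after one halving
print(total_supply() / UNITS_PER_BITCOIN)         # stays below 21 million
```

Under these assumptions the rewards form a geometric series, so the total supply is bounded by twice the first period's issuance, i.e. 21 million bitcoins.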

In the early days of Bitcoin transaction fees were mostly set to zero, but as Bitcoin has gained in popularity, transaction fees have gradually risen, and are now a substantial additional incentive on top of the 25 bitcoin reward for mining a block.

You can think of proof-of-work as a competition to approve transactions. Each entry in the competition costs a little bit of computing power. So, for instance, if a miner controls one percent of the computing power being used to validate Bitcoin transactions, then they have roughly a one percent chance of winning the competition.

So provided a lot of computing power is being brought to bear on the competition, a dishonest miner is likely to have only a relatively small chance to corrupt the validation process, unless they expend a huge amount of computing resources.

Before doing that, I want to fill in an important detail in the description of Infocoin. Each new block of transactions includes a pointer to the previous block in the chain. The pointer is actually just a hash of the previous block. So typically the block chain is just a linear chain of blocks of transactions, one after the other, with later blocks each containing a pointer to the immediately prior block.
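A chain of blocks linked by hash pointers can be sketched like this. The block layout (a dict with `prev_hash` and `transactions`) and the JSON serialization are assumptions chosen for readability, not Bitcoin's real format.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block by hashing a canonical JSON serialization of it."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def append_block(chain, transactions):
    """Add a block whose pointer is the hash of the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else None
    chain.append({"prev_hash": prev_hash, "transactions": transactions})


def is_consistent(chain):
    """Check that every block's pointer matches the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["Alice pays Bob 1 infocoin"])
append_block(chain, ["Bob pays Charlie 1 infocoin"])
print(is_consistent(chain))
```

Because each pointer depends on the full contents of the previous block, altering an old block breaks every pointer after it.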

Occasionally, a fork will appear in the block chain. This can happen, for instance, if by chance two miners happen to validate a block of transactions near-simultaneously: both broadcast their newly-validated block out to the network, and some people update their block chain one way, and others update their block chain the other way.

The rule is this: if a fork occurs, people on the network keep track of both forks. But at any given time, miners only work to extend whichever fork is longest in their copy of the block chain. Suppose, for example, that we have a fork in which some miners receive block A first, and some miners receive block B first. Those miners who receive block A first will continue mining along that fork, while the others will mine along fork B. Suppose that the miners working on fork B are the next to successfully mine a block. After they receive news that this has happened, the miners working on fork A will notice that fork B is now longer, and will switch to working on that fork.

Presto, in short order work on fork A will cease, and everyone will be working on the same linear chain, and block A can be ignored. Of course, any still-pending transactions in A will still be pending in the queues of the miners working on fork B, and so all transactions will eventually be validated.

Likewise, it may be that the miners working on fork A are the first to extend their fork. In that case work on fork B will quickly cease, and again we have a single linear chain.
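The fork-resolution rule can be sketched as a one-line choice. This assumes each fork is held as a list of block names and that ties go to whichever fork the miner saw first. `choose_fork` is a hypothetical name.

```python
def choose_fork(forks):
    """Return the longest fork. On a tie, max() keeps the earliest entry,
    which models a miner sticking with the fork it received first."""
    return max(forks, key=len)


# A miner who received block A first, then heard about block B.
fork_a = ["genesis", "A"]
fork_b = ["genesis", "B"]
print(choose_fork([fork_a, fork_b]))   # still fork A: equal length

# Someone extends fork B; the miner now switches.
fork_b.append("B2")
print(choose_fork([fork_a, fork_b]))
```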

No matter what the outcome, this process ensures that the block chain has an agreed-upon time ordering of the blocks. In Bitcoin proper, a transaction is not considered confirmed until (1) it is part of a block in the longest fork, and (2) at least 5 blocks follow it in the longest fork. This gives the network time to come to an agreed-upon ordering of the blocks. Suppose Alice tries to double spend with Bob and Charlie. One possible approach is for her to try to validate a block that includes both transactions.

Assuming she has one percent of the computing power, she will occasionally get lucky and validate the block by solving the proof-of-work. Unfortunately for Alice, the double spending will be immediately spotted by other people in the Infocoin network and rejected, despite solving the proof-of-work problem. A more serious problem occurs if she broadcasts two separate transactions in which she spends the same infocoin with Bob and Charlie, respectively.

She might, for example, broadcast one transaction to a subset of the miners, and the other transaction to another set of miners, hoping to get both transactions validated in this way. In fact, knowing that this will be the case, there is little reason for Alice to try this in the first place.

She will then attempt to fork the chain before the transaction with Charlie, adding a block which includes a transaction in which she pays herself. And unless Alice is able to solve the proof-of-work at least as fast as everyone else in the network combined (roughly, that means controlling more than fifty percent of the computing power) then she will just keep falling further and further behind.

Of course, she might get lucky. We can, for example, imagine a scenario in which Alice controls one percent of the computing power, but happens to get lucky and finds six extra blocks in a row, before the rest of the network has found any extra blocks.

In this case, she might be able to get ahead, and get control of the block chain. But this particular event will occur with probability of roughly $(1/100)^6 = 10^{-12}$. Of course, this is not a rigorous security analysis showing that Alice cannot double spend. The security community is still analysing Bitcoin, and trying to understand possible vulnerabilities. The proof-of-work and mining ideas give rise to many questions.
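The arithmetic behind this estimate is short enough to check directly. The first function below models Alice winning several blocks in a row, each with probability equal to her share of the computing power. The second uses the gambler's-ruin expression from the original Bitcoin paper for the chance of ever catching up from a given number of blocks behind. Both function names are hypothetical, and neither amounts to a rigorous security analysis.

```python
def lucky_streak(share, blocks):
    """Chance of winning `blocks` consecutive blocks with hash-power `share`."""
    return share ** blocks


def catch_up(share, blocks_behind):
    """Chance an attacker with hash-power `share` ever catches up from
    `blocks_behind` blocks behind (gambler's-ruin form, q/p to the power z)."""
    honest = 1 - share
    if share >= honest:
        return 1.0
    return (share / honest) ** blocks_behind


print(lucky_streak(0.01, 6))
print(catch_up(0.01, 6))
print(catch_up(0.5, 6))
```

With a one percent share both numbers are of order $10^{-12}$, while at fifty percent or more the attacker eventually catches up with certainty.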

How much reward is enough to persuade people to mine? How does the change in supply of infocoins affect the Infocoin economy? Will Infocoin mining end up concentrated in the hands of a few, or many? These are all great questions, but beyond the scope of this post. I may come back to the questions in the context of Bitcoin in a future post.

To use Bitcoin in practice, you first install a wallet program on your computer. You can see the Bitcoin balance on the left — 0. What you do is tell your wallet program to generate a Bitcoin address.

You then send your Bitcoin address to the person who wants to buy from you. You could do this in email, or even put the address up publicly on a webpage. This is safe, since the address is merely a hash of your public key, which can safely be known by the world anyway.

The person who is going to pay you then generates a transaction. Line 1 contains the hash of the remainder of the transaction, 7c This is used as an identifier for the transaction. Lines 3 and 4 tell us that the transaction has one input and one output, respectively. Line 6 tells us the size in bytes of the transaction. Lines 7 through 11 define the input to the transaction. In particular, lines 8 through 10 tell us that the input is to be taken from the output from an earlier transaction, with the given hash , which is expressed in hexadecimal as ae Line 11 contains the signature of the person sending the money, Again, these are both in hexadecimal.

This seems like an inconvenient restriction — like trying to buy bread with a 20 dollar note, and not being able to break the note down. The solution, of course, is to have a mechanism for providing change. Lines 12 through 14 define the output from the transaction. In particular, line 13 tells us the value of the output, 0. Line 14 is somewhat complicated. The main thing to note is that the string a7db6f You can now see, by the way, how Bitcoin addresses the question I swept under the rug in the last section: In fact, the role of the serial number is played by transaction hashes.

In the transaction above, for example, the recipient is receiving 0. There are two clever things about using transaction hashes instead of serial numbers. Second, by operating in this way we remove the need for any central authority issuing serial numbers. Instead, the serial numbers can be self-generated, merely by hashing the transaction.
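To see how a hash can serve as a self-generated serial number, here is a minimal sketch. It assumes a transaction is a plain dict serialized as canonical JSON and hashed twice with SHA-256. Real Bitcoin hashes a binary serialization, so these identifiers would not match real ones. All field names are made up for illustration.

```python
import hashlib
import json


def transaction_id(tx):
    """Derive an identifier by double-hashing a canonical serialization."""
    payload = json.dumps(tx, sort_keys=True).encode()
    return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


# Hypothetical earlier transaction whose output is being spent.
earlier = {"inputs": [], "outputs": [{"to": "alice", "value": 50}]}

# The new transaction refers to the earlier one by its hash.
spend = {
    "inputs": [{"tx": transaction_id(earlier), "output_index": 0}],
    "outputs": [{"to": "bob", "value": 50}],
}
print(transaction_id(spend))
```

No central authority is needed to hand out identifiers: anyone holding the transaction can recompute its hash, and any change to the transaction changes the identifier.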

Ultimately, this process must terminate. This can happen in one of two ways. This is a special transaction, having no inputs, but a 50 Bitcoin output. In other words, this transaction establishes an initial money supply. You can see the deserialized raw data here , and read about the Genesis block here. With the exception of the Genesis block, every block of transactions in the block chain starts with a special coinbase transaction. This is the transaction rewarding the miner who validated that block of transactions.

It uses a similar but not identical format to the transaction above. You can read a little more about coinbase transactions here. The obvious thing to do is for the payer to sign the whole transaction apart from the transaction hash, which, of course, must be generated later. Currently, this is not what is done: some pieces of the transaction are omitted. This makes some pieces of the transaction malleable, i.e., they can be changed later. I gather that this malleability is under discussion in the Bitcoin developer community, and there are efforts afoot to reduce or eliminate this malleability.

In the last section I described how a transaction with a single input and a single output works. Line 1 contains the hash of the remainder of the transaction.

As in the single-input-single-output case this is set to 0, which means the transaction is finalized immediately. Lines 7 through 19 define a list of the inputs to the transaction.

Each corresponds to an output from a previous Bitcoin transaction. Line 11 contains the signature, followed by a space, and then the public key of the person sending the bitcoins.

Lines 12 through 15 define the second input, with a similar format to lines 8 through And lines 16 through 19 define the third input. The first output is defined in lines 21 and Line 21 tells us the value of the output, 0. The main thing to take away here is that the string e8c One apparent oddity in this description is that although each output has a Bitcoin value associated to it, the inputs do not.

Of course, the values of the respective inputs can be found by consulting the corresponding outputs in earlier transactions. In a standard Bitcoin transaction, the sum of all the inputs in the transaction must be at least as much as the sum of all the outputs.

The only exception to this principle is the Genesis block, and in coinbase transactions, both of which add to the overall Bitcoin supply. If the inputs sum up to more than the outputs, then the excess is used as a transaction fee. This is paid to whichever miner successfully validates the block which the current transaction is a part of. One nice application of multiple-input-multiple-output transactions is the idea of change. Suppose, for example, that I want to send you 0.

I can do so by spending money from a previous transaction in which I received 0. The solution is to send you 0. Of course, it differs a little from the change you might receive in a store, since change in this case is what you pay yourself. But the broad idea is similar. That completes a basic description of the main ideas behind Bitcoin. But I have described the main ideas behind the most common use cases for Bitcoin.
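The bookkeeping for change and fees can be sketched as follows. Amounts are integers in units of $10^{-8}$ bitcoin to avoid rounding problems, the specific numbers are invented for illustration, and the output labels `recipient` and `change_to_self` are hypothetical.

```python
def build_outputs(input_values, amount, fee):
    """Split the inputs into a payment, change back to the payer, and a fee.
    The fee is never written down: it is whatever the outputs leave unspent."""
    change = sum(input_values) - amount - fee
    if change < 0:
        raise ValueError("inputs do not cover the amount plus the fee")
    outputs = [("recipient", amount)]
    if change > 0:
        outputs.append(("change_to_self", change))
    return outputs


def implied_fee(input_values, outputs):
    """Fee collected by the miner: sum of inputs minus sum of outputs."""
    return sum(input_values) - sum(value for _, value in outputs)


# One earlier output worth 0.2 bitcoin pays 0.15 bitcoin plus a small fee.
inputs = [20_000_000]
outputs = build_outputs(inputs, amount=15_000_000, fee=10_000)
print(outputs)
print(implied_fee(inputs, outputs))
```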

How anonymous is Bitcoin? Many people claim that Bitcoin can be used anonymously. This claim has led to the formation of marketplaces such as Silk Road and various successors , which specialize in illegal goods. However, the claim that Bitcoin is anonymous is a myth. The block chain is a marvellous target for these techniques.

I will be extremely surprised if the great majority of Bitcoin users are not identified with relatively high confidence and ease in the near future. Furthermore, identification will be retrospective, meaning that someone who bought drugs on Silk Road in will still be identifiable on the basis of the block chain in, say, These de-anonymization techniques are well known to computer scientists, and, one presumes, therefore to the NSA.

I would not be at all surprised if the NSA and other agencies have already de-anonymized many users. It is, in fact, ironic that Bitcoin is often touted as anonymous. Bitcoin is, instead, perhaps the most open and transparent financial instrument the world has ever seen.

Can you get rich with Bitcoin? I must admit I find this perplexing. What is, I believe, much more interesting and enjoyable is to think of Bitcoin and other cryptocurrencies as a way of enabling new forms of collective behaviour. But if money in the bank is your primary concern, then I believe that other strategies are much more likely to succeed. One is a nice space-saving trick used by the protocol, based on a data structure known as a Merkle tree. You can get an overview in the original Bitcoin paper.

You can read more about it at some of the links above. But this is only a small part of a much bigger and more interesting story. But the scripting language can also be used to express far more complicated transactions. To put it another way, Bitcoin is programmable money. In later posts I will explain the scripting system, and how it is possible to use Bitcoin scripting as a platform to experiment with all sorts of amazing financial instruments.

You can tip me with Bitcoin! You may also enjoy the first chapter of my forthcoming book on neural networks and deep learning, and may wish to follow me on Twitter.

In my legally uninformed opinion digital money may make this issue more complicated. At least naively, it looks more like speech than exchanging copper coins, say. The chapter explains the basic ideas behind neural networks, including how they learn. I show how powerful these ideas are by writing a short program which uses neural networks to solve a hard problem — recognizing handwritten digits.

The chapter also takes a brief look at how deep learning works. There are many malicious sites on the web, and you want your browser to warn users when they attempt to access dangerous sites. For example, suppose the user attempts to access http: An obvious naive way is for your browser to maintain a list or set data structure containing all known malicious domains.

A problem with this approach is that it may consume a considerable amount of memory. If you know of a million malicious domains, and domains need say an average of 20 bytes to store, then you need 20 megabytes of storage.

Is there a better way? The data structure is known as a Bloom filter. Most explanations of Bloom filters cut to the chase, quickly explaining the detailed mechanics of how Bloom filters work. Such explanations are informative, but I must admit that they made me uncomfortable when I was first learning about Bloom filters. And that left me feeling that all I had was a superficial, surface-level understanding of Bloom filters. In this post I take an unusual approach to explaining Bloom filters.

Rather, the benefit of developing Bloom filters in this way is that it will deepen our understanding of why Bloom filters work in just the way they do. But if your goal is to understand why Bloom filters work the way they do, then you may enjoy the post. Most of my posts are code-oriented.

This post is much more focused on mathematical analysis and algebraic manipulation. General description of the problem: We want a data structure which represents a set $S$ of objects. That data structure should enable two operations: add, which adds an object $x$ to the set, and test, which determines whether a given object $x$ is a member of $S$. One idea is to store hashed versions of the objects in $S$, instead of the full objects. If the hash function is well chosen, then the hashed objects will take up much less memory, but there will be little danger of making errors when testing whether an object is an element of the set or not.

We have a set $S$ of objects $x_0, x_1, \ldots, x_{|S|-1}$, where $|S|$ denotes the number of objects in $S$. For each object we compute an $m$-bit hash function $h(x_j)$, i.e., a hash function which takes an object as input and outputs $m$ bits, and we represent $S$ by the set of hashes $\{h(x_0), h(x_1), \ldots, h(x_{|S|-1})\}$. We can test whether $x$ is an element of $S$ by checking whether $h(x)$ is in the set of hashes. This basic hashing approach requires roughly $m|S|$ bits of memory.

The saving is possible because the ordering of the objects in a set is redundant information, and so in principle can be eliminated using a suitable encoding. A danger with this hash-based approach is that an object $x$ outside the set $S$ might have the same hash value as an object inside the set, i.e., $h(x) = h(x_j)$ for some $j$. In this case, test will erroneously report that $x$ is in $S$. That is, this data structure will give us a false positive.

Fortunately, by choosing a suitable value for $m$, the number of bits output by the hash function, we can reduce the probability of a false positive as much as we want. To understand how this works, notice first that the probability of test giving a false positive is 1 minus the probability of test correctly reporting that $x$ is not in $S$. This occurs when $h(x) \neq h(x_j)$ for all $j$. If the hash function is well chosen, then the probability that $h(x) \neq h(x_j)$ is $(1 - 1/2^m)$ for each $j$, and these are independent events.

Thus the probability of test failing is $p = 1 - (1 - 1/2^m)^{|S|}$. This expression involves three quantities: the probability $p$ of test giving a false positive, the number $m$ of bits output by the hash function, and the number of elements in the set, $|S|$. To understand how much memory is needed, we let $\#$ be the number of bits of memory used, and aim to express $\#$ as a function of $p$ and $|S|$. Observe that $\# = |S|m$, and so we can substitute $m = \#/|S|$ to obtain $p = 1 - (1 - 1/2^{\#/|S|})^{|S|}$. This can be rearranged to express $\#$ in terms of $p$ and $|S|$: $\# = |S| \log_2 \frac{1}{1 - (1-p)^{1/|S|}}$. This expression answers the question we really want answered, telling us how many bits are required to store a set of size $|S|$ with a probability $p$ of a test failing. When $p$ is small we can approximate $(1-p)^{1/|S|} \approx 1 - p/|S|$, giving $\# \approx |S| \log_2 \frac{|S|}{p}$. This makes intuitive sense: test failure occurs when $x$ is not in $S$, but $h(x)$ is in the hashed version of $S$. Because this happens with probability $p$, it must be that the hashed version of $S$ occupies a fraction $p$ of the total space of possible hash outputs.

And so the size of the space of all possible hash outputs must be about $|S|/p$. As a consequence we need $\log_2(|S|/p)$ bits to represent each hashed object, in agreement with the expression above. How memory efficient is this hash-based approach to representing $S$? The big drawback of this hash-based approach is the false positives.
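The basic hashing scheme and its memory estimate can be sketched together. The $m$-bit hash here is taken from the top bits of SHA-256, which is an assumption made for convenience (so $m$ must be at most 256). The class and function names are hypothetical.

```python
import hashlib
import math


class HashedSet:
    """Represent a set by storing only an m-bit hash of each element."""

    def __init__(self, m):
        self.m = m
        self.hashes = set()

    def _h(self, x):
        digest = hashlib.sha256(x.encode()).digest()
        return int.from_bytes(digest, "big") >> (256 - self.m)  # top m bits

    def add(self, x):
        self.hashes.add(self._h(x))

    def test(self, x):
        return self._h(x) in self.hashes


def false_positive_probability(size, m):
    """p = 1 - (1 - 1/2^m)^|S| for a set of `size` elements."""
    return 1 - (1 - 2.0 ** -m) ** size


def bits_needed(size, p):
    """Small-p approximation: |S| * log2(|S| / p) bits."""
    return size * math.log2(size / p)


s = HashedSet(m=32)
s.add("malicious.example")
print(s.test("malicious.example"))
print(false_positive_probability(1_000_000, 40))
print(bits_needed(1_000_000, 0.001) / 8 / 1e6, "megabytes")
```

For a million elements and a one-in-a-thousand false positive rate, the estimate comes to a few megabytes.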

For example, false positives turn out to be okay for the safe web browsing problem. You might worry that false positives would cause some safe sites to erroneously be reported as unsafe, but the browser can avoid this by maintaining a small!

Suppose we want to represent some subset $S$ of the integers $0, 1, 2, \ldots, 999$. As an alternative to hashing or to storing $S$ directly, we could represent $S$ using an array of $1000$ bits, numbered $0$ through $999$. We would set bits in the array to $1$ if the corresponding number is in $S$, and otherwise set them to $0$. The memory cost to store $S$ in this bit-array approach is $1000$ bits, regardless of how big or small $|S|$ is.

Suppose, for comparison, that we stored $S$ directly as a list of 32-bit integers. Then the cost would be $32|S|$ bits. When $|S|$ is very small, this approach would be more memory efficient than using a bit array. But as $|S|$ gets larger, storing $S$ directly becomes much less memory efficient. We could ameliorate this somewhat by storing elements of $S$ using only 10 bits, instead of 32 bits.

But even if we did this, it would still be more expensive to store the list once $|S|$ got beyond one hundred elements. So a bit array really would be better for modestly large subsets. A problem with the bit array example described above is that we needed a way of numbering the possible elements of $S$, $0, 1, \ldots, 999$. In general the elements of $S$ may be complicated objects, not numbers in a small, well-defined range. Fortunately, we can use hashing to number the elements of $S$.

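Suppose $h$ is an $m$-bit hash function. We represent a set $S = \{x_0, \ldots, x_{|S|-1}\}$ using a bit array containing $2^m$ elements. In particular, for each $x_j$ we set the $h(x_j)$th element in the bit array, where we regard $h(x_j)$ as a number in the range $0, 1, \ldots, 2^m - 1$. More explicitly, we can add an element $x$ to the set by setting bit number $h(x)$ in the bit array. And we can test whether $x$ is an element of $S$ by checking whether bit number $h(x)$ in the bit array is set.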
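A single hashed bit array of this kind might look like the following sketch. The hash is SHA-256 reduced modulo $2^m$, which is an illustrative choice, and the class name `BitArrayFilter` is hypothetical.

```python
import hashlib


class BitArrayFilter:
    """A set represented by 2^m bits, indexed by an m-bit hash."""

    def __init__(self, m):
        self.size = 2 ** m
        self.bits = bytearray((self.size + 7) // 8)  # packed, 8 bits per byte

    def _h(self, x):
        digest = hashlib.sha256(x.encode()).digest()
        return int.from_bytes(digest, "big") % self.size

    def add(self, x):
        i = self._h(x)
        self.bits[i // 8] |= 1 << (i % 8)

    def test(self, x):
        i = self._h(x)
        return bool(self.bits[i // 8] & (1 << (i % 8)))


f = BitArrayFilter(m=16)
f.add("malicious.example")
print(f.test("malicious.example"))
```

Memory use is fixed at $2^m$ bits no matter how many elements are added, which is the key difference from storing a list of hashes.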

This is a good scheme, but the test can fail to give the correct result, which occurs whenever $x$ is not an element of $S$, yet $h(x) = h(x_j)$ for some $j$. This is exactly the same failure condition as for the basic hashing scheme we described earlier.

By exactly the same reasoning as used then, the failure probability is $p = 1 - (1 - 1/2^m)^{|S|}$. This works differently than for the basic hashing scheme, since the number of bits of memory consumed by the current approach is $\# = 2^m$, as compared to $\# = |S|m$ for the earlier scheme.

Rearranging this to express $\#$ in terms of $p$ and $|S|$ we obtain $\# = \frac{1}{1 - (1-p)^{1/|S|}}$. When $p$ is small this can be approximated by $\# \approx |S|/p$. The only time the current approach is better is when $|S|$ is very, very large. To get some idea for just how large: $|S|/p$ is only better than $|S| \log_2(|S|/p)$ when $|S|$ gets to be more than about $p \, 2^{1/p}$, which for $p = 0.01$, say, is roughly $1.27 \times 10^{28}$. In practice, the basic hashing scheme will be much more memory efficient. At this point, hashing into bit arrays looks like a bad idea. But it turns out that by tweaking the idea just a little we can improve it a lot.
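The comparison between the two schemes is easy to check numerically. The sketch below uses the small-$p$ approximations, with $|S| \log_2(|S|/p)$ bits for basic hashing and $|S|/p$ bits for a single hashed bit array, and solves for the set size at which the two costs cross. The function names are hypothetical.

```python
import math


def basic_hashing_bits(size, p):
    """Approximate memory for storing m-bit hashes: |S| * log2(|S| / p)."""
    return size * math.log2(size / p)


def single_filter_bits(size, p):
    """Approximate memory for one hashed bit array: |S| / p."""
    return size / p


def crossover_size(p):
    """Set size beyond which the bit array is cheaper.
    |S|/p < |S| log2(|S|/p) exactly when |S| > p * 2^(1/p)."""
    return p * 2 ** (1 / p)


size, p = 1_000_000, 0.01
print(basic_hashing_bits(size, p))
print(single_filter_bits(size, p))
print(crossover_size(p))
```

For any realistic set size the single bit array loses by a wide margin, which is what motivates the refinements that follow.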

How can we make the basic filter just described more memory efficient? One possibility is to try using multiple filters, based on independent hash functions. More precisely, the idea is to use $k$ filters, each based on an independent $m$-bit hash function, $h_0, h_1, \ldots, h_{k-1}$.

So our data structure will consist of $k$ separate bit arrays, each containing $2^m$ bits, for a grand total of $\# = k 2^m$ bits. We can add an element $x$ by setting the $h_0(x)$th bit in the first bit array (i.e., the first filter), the $h_1(x)$th bit in the second filter, and so on. We can test whether a candidate element $x$ is in the set by simply checking whether all the appropriate bits are set in each filter.

For this to fail, each individual filter must fail. Because the hash functions are independent of one another, the probability of this is the $k$th power of any single filter failing: $p = \left(1 - (1 - 1/2^m)^{|S|}\right)^k$. The number of bits of memory used by this data structure is $\# = k 2^m$ and so we can substitute $2^m = \#/k$ and rearrange to get $\# = \frac{k}{1 - (1 - p^{1/k})^{1/|S|}}$. Provided $p^{1/k}$ is much smaller than $1$, this expression can be simplified to give $\# \approx \frac{k|S|}{p^{1/k}}$.

This repetition strategy is much more memory efficient than a single filter, at least for small values of $k$. For instance, moving from $k = 1$ repetitions to $k = 2$ repetitions changes the denominator from $p$ to $\sqrt{p}$, which is typically a huge improvement, since $p$ is very small.

And the only price paid is doubling the numerator. So this is a big win. Intuitively, and in retrospect, this result is not so surprising. Putting multiple filters in a row, the probability of error drops exponentially with the number of filters.
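The effect of repeating filters can be tabulated with the approximate cost $k|S|/p^{1/k}$, which only holds while $p^{1/k}$ stays much smaller than 1. The sketch below prints the estimate for a few small values of $k$. The function name and the example numbers are assumptions made for illustration.

```python
def repeated_filter_bits(size, p, k):
    """Approximate memory for k independent filters: k * |S| / p^(1/k).
    Only meaningful while p ** (1 / k) is much smaller than 1."""
    return k * size / p ** (1 / k)


size, p = 1_000_000, 1e-4
for k in (1, 2, 3, 4):
    print(k, round(repeated_filter_bits(size, p, k)))
```

Going from one filter to two doubles the numerator but shrinks the denominator from $p$ to $\sqrt{p}$, so for small $p$ the total falls sharply.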

By contrast, in the single filter scheme, the drop in the probability of error is roughly linear with the number of bits. So using multiple filters is a good strategy. For larger values of $k$ the analysis is somewhat more complicated. This is a variation on the idea of repeating filters. Instead of having $k$ separate bit arrays, we use just a single array of $2^m$ bits. When adding an object $x$, we simply set all the bits $h_0(x), h_1(x), \ldots, h_{k-1}(x)$ in the same bit array. To test whether an element $x$ is in the set, we simply check whether all the bits $h_0(x), h_1(x), \ldots, h_{k-1}(x)$ are set or not.

Failure occurs when $h_0(x) = h_{i_0}(x_{j_0})$ for some $i_0$ and $j_0$, and also $h_1(x) = h_{i_1}(x_{j_1})$ for some $i_1$ and $j_1$, and so on for all the remaining hash functions, $h_2, h_3, \ldots, h_{k-1}$. These are independent events, and so the probability they all occur is just the product of the probabilities of the individual events.
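The single-array variant described above can be sketched as a small class. The $k$ hash functions are derived here by prefixing the input with the index of the hash function before applying SHA-256, which is one convenient assumption among many possible choices. The class name `BloomFilter` and its parameters are illustrative.

```python
import hashlib


class BloomFilter:
    """k hash functions sharing one array of 2^m bits."""

    def __init__(self, m, k):
        self.size = 2 ** m
        self.k = k
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, x):
        # Simulate k independent hash functions by salting with the index i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{x}".encode()).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, x):
        for pos in self._positions(x):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def test(self, x):
        # True only if every one of the k bits is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(x)
        )


bloom = BloomFilter(m=16, k=4)
for domain in ["malicious.example", "phishing.example"]:
    bloom.add(domain)
print(bloom.test("malicious.example"))
```

A filter built this way never reports a false negative: every element that was added sets all of its own bits. False positives remain possible, at a rate governed by $m$, $k$ and the number of elements added.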

