Code Comments
Programming Forum and web-based access to our favorite programming groups.

Normally I don't like to come here; a couple of people here are not nice people. Regardless, I am trying to figure out a way to create a statistical occurrence table and check for compressibility based upon the results I have.

I have 2 different possible outcomes with an equal occurrence chance for each outcome statistically. This much I have proven, but stress, fatigue, and 3 jobs are killing my ability to proceed a little further.... thank God one of the jobs ends next w....

Anyhow, each outcome might seem to result in gibberish, but my method actually has them sorted successfully. Guessing won't be useful; I just won't reply unless you somehow get an exact, dead-on match of my method. Let's just say I want to keep it in the closet a little longer. No loss is involved. Some minor growth is included, but statistically it is extremely low.

RESULT #1
00
01
0
0
1
1
1
1

RESULT #2
000
001
00
01
01
01
0
1
1
1
1
1
1
1

While they are balanced in total 1's versus 0's in each method, it seems to me that the longer length of some of the outcomes means we might be able to obtain compression. It "feels" as if there are patterns attackable with one of a variety of methods, yet proving it is a bit beyond my time/energy at this moment.

Anyone care to assist me with this? btw result #3 with this system was funny: 0 and 1.. exactly :P

I have considered re-running my system, and have considered using variants, but I would love to see the results from this if anyone has spare time and some curiosity.
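The balance claim in the post above can be checked mechanically, and the same data gives a first rough compressibility estimate. The sketch below is only that, a sketch: it assumes each listed outcome is equally likely (the post does not state the method, so this is a guess at the intended weighting), and the function names are made up for illustration. It counts 0's and 1's, computes the Shannon entropy of the outcome distribution, and evaluates the Kraft sum over the distinct strings.

```python
from collections import Counter
from math import log2

# Outcome lists copied from the post; equal weighting is an assumption.
RESULT_1 = ["00", "01", "0", "0", "1", "1", "1", "1"]
RESULT_2 = ["000", "001", "00", "01", "01", "01",
            "0", "1", "1", "1", "1", "1", "1", "1"]

def bit_balance(outcomes):
    """Return (number of 0 bits, number of 1 bits) over all outcomes."""
    joined = "".join(outcomes)
    return joined.count("0"), joined.count("1")

def outcome_entropy(outcomes):
    """Shannon entropy in bits of the distribution over distinct outcomes."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def kraft_sum(outcomes):
    """Sum of 2^-length over the distinct outcome strings."""
    return sum(2.0 ** -len(s) for s in set(outcomes))

for name, res in (("RESULT #1", RESULT_1), ("RESULT #2", RESULT_2)):
    zeros, ones = bit_balance(res)
    mean_len = sum(len(s) for s in res) / len(res)
    print(name, "zeros:", zeros, "ones:", ones,
          "entropy:", round(outcome_entropy(res), 4),
          "mean length:", round(mean_len, 4),
          "Kraft sum:", kraft_sum(res))
```

One thing worth noticing: by the Kraft-McMillan inequality, a set of binary strings can only be uniquely decodable when written back-to-back if its Kraft sum is at most 1. Both lists exceed 1 here (for example "0" is a prefix of "00" and "01"), so if the outcomes are meant to be concatenated, the boundaries between them have to be stored somewhere else, and that side information counts toward the total size.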
On Nov 22, 5:38 am, Einstein <michae...@gmail.com> wrote:
> Normally I don't like to come here, a couple people here are not nice
> people, but regardless I am trying to figure a way to create a
> statistical occurrence table and check for compressibility based upon
> the results I have.
> [...]

hi there,

I've been analyzing the method you're presenting, and as you're doing statistical analysis per bit, you'll end up with data corruption, with the probability being, of course, the complement of the successful prediction ratio.

if you boost the prediction ratio up, awesome, but consider that 99% success means 1 of 100 bits will be corrupted. if the corrupted bit cannot be specifically marked as corrupted (increasing the total transmit bit count by 100%) 100% of the time, then all of a sudden you're no longer doing lossless encoding, and are instead working with a lossy formula.

without a 100% method of lossless encoding, you aren't going to be able to fold the data layers over top of each other, thus you lose the composite construction, blasting your n'th dimensional method back to 2d, and a lossy 2d at that.

by no means am I proposing that your research is fruitless, as you'll come to really interesting conclusions and starting points for other, more intricate and sublime methodologies following from understanding your path here. however, it's your projections into the static data stream that allow you to fill in the missing information, and this isn't taking place at a quantum level.

figure out how to properly *generate* the curves that you need to align the statistical mechanics with the patterns you're seeing, and you'll be one step closer to your goal.

chris.
Thanks for replying, Chris, and for the thought-out post.

Fortunately, however, this is not a predictive stream; these are actual result outcomes from the system I have set up. I will say this is only a portion of the input. The rest is sitting in an identified-length subsection and is not being compressed 'at this moment'. These factual outcome results are the unique aspect of my system. I need an actual probabilities table written up on this asap. Sadly no time :(
On Nov 23, 2:44 pm, Einstein <michae...@gmail.com> wrote:
> Thanks for replying Chris, and for the thought out post.
>
> However fortunately this is not a predictive stream, this is actual
> result outcomes from the system I have set up. I will say this is only
> a portion of the input, the rest is sitting in an identified length
> subsection and is not being compressed 'at this moment'. This factual
> outcome results is the unique aspect of my system. I need an actual
> probabilities table written up on this asap. Sadly no time :(

the next gate is the injection portion.. it would seem to be possible to properly justify the positional portions of compressed and uncompressed sections, yet you're looking at it lkie tihs. as yuor mnid can poreprly deodce the positions of the data from a permuted source, intuitively it'll seem like this is also possible at the data compression end. the actuality is not so much, unless again you're able to isolate the curves that are providing the rearrangement.

all this combined means that if you have to specifically mark the portions as being 'compressed' and 'uncompressed', then you'll have to use block sizes much greater than a single bit/byte. at which point you need to make sure that the data is able to be aligned without losing all the benefits from the compression method.

just be aware that your mind is capable of 'livening up' any data source by providing fractal cohesion, and this is the human element. as soon as you take the human element out of the equation by using a fully permuted data stream of k elements in base t, you have to really understand why some things seem intuitively possible, probable, likely, and necessary.

i think the reason you're getting improper reflections from this group is that your system isn't fractally cohesive.. what you have going on simply doesn't work losslessly, and this is a lossless compression forum.

post the logical arguments, say exactly where you're at in making this thing work, and we can probably figure out if it's possible, and learn a lot in the process even if it's not. if you're totally honest with your approach, you'll find it a lot easier to get a true look at where your system is losing cohesion.

chris.
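The block-size remark above can be put in numbers. As a rough illustration only: if every block carries a fixed 'compressed'/'uncompressed' flag, the flag cost relative to the payload is flag bits divided by block bits. The function name and the one-bit flag are assumptions made for this sketch, not anything specified in the thread.

```python
from fractions import Fraction

def flag_overhead(block_bits: int, flag_bits: int = 1) -> Fraction:
    """Extra bits per payload bit when each block carries a fixed-size flag."""
    if block_bits <= 0:
        raise ValueError("block_bits must be positive")
    return Fraction(flag_bits, block_bits)

# Overhead shrinks as blocks grow; these block sizes are arbitrary examples.
for block_bits in (1, 8, 64, 4096):
    overhead = flag_overhead(block_bits)
    print(f"{block_bits:5d}-bit blocks: {float(overhead):.4%} overhead")
```

With one-bit blocks the flag doubles the stream, which is the 100% figure mentioned earlier in the thread. Whatever the block size, the method has to save more than the flag costs on average before there is any net gain.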
Sir, I can, and would, put a $10,000 bet that this is the outcome of the method "statistically". The math on that is irrefutable. Additionally I can, and would, place a bet of $10,000 that it is lossless.

I will not be posting methods until I have had time to check the results out, determine the statistical chances of compression, and see if the level of compression on this portion outweighs the increase in size needed to separate the portion out. Sadly I just had 3 more projects dropped in my lap, required to be finished by Monday... on top of 3 pre-existing projects.

And before you ask, sir: yes, it is possible to return the data entirely to the original code with no loss.

I will write out a basic version of the statistical layout when I can later: every outcome of 2 groups. I would like every one of, say, 16 possible combinations in sequence, which would of course be the best statistical model to generate a probabilities table from, and therefore also to start examining deviations from the statistical norm. This would of course be a rather large outcome, but the information gathered would be priceless for understanding whether there could be compression based upon the results therein.
ok, time to do some basics

0000 0001 000 000 001 001 001 001
0100 0101 010 010 011 011 011 011
000  001  00  00  01  01  01  01
000  001  00  00  01  01  01  01
100  101  10  10  11  11  11  11
100  101  10  10  11  11  11  11
100  101  10  10  11  11  11  11
100  101  10  10  11  11  11  11

This is a simple version of the 'statistical outcomes' if we placed a random pair next to each other, using the 8 outcome version.

0000 x 1
0001 x 1
0100 x 1
0101 x 1
010 x 2
000 x 4
011 x 4
100 x 4
101 x 4
00 x 4
001 x 6
01 x 8
10 x 8
11 x 16

That is 64 total outcomes when viewed as direct results. However, when we look at 00, 01, 10, 11 results... well, we can't... statistically we don't have a large enough sample if we wish to just run it at the existing length, unless we add a system to account for our non-even length issues. However, every time I try to make a model to account for that, I have issues. So my goal is to run it at greater than 16 bits in length and to just do a lesser version to account for the non-even lengths.

I am probably missing something rather easy to do to generate the needed information with a basic series, but I am so tired these days, so much work... this is the sort of help I was hoping to find; so far 5 Usenet groups and no luck.

Of course each result is a probability, and I am at this moment looking for statistical outcomes. Mostly I am relying upon the fact that in very large files the total numbers of 1's and 0's tend to be roughly balanced, within a specific identifiable range. Thus I can count on 'usually' having a fairly even distribution of any possible string size up to 10 bits in length among all possibilities of that string, within a certain allowable margin of error.

Once I know the statistical 'fallout' after engaging my algorithm, I would know each individual line's compressibility and chance of occurrence, and then be able to draw a conclusion about whether it is compressible, under what methods it is compressible, and whether it is worth trying given the minor size increase I incur to track the initial file size and the size after the algorithm has run, plus 2 bits to identify no algorithm, version 1, version 2, or version 3 (3 being the one which comes out statistically tied). The results will also tell me if I need to work with a lower or higher base inside the algorithm for a specific part of the formula.
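The pair table and the occurrence counts in this post can be generated rather than written out by hand. A minimal sketch, assuming (as the post does) that two outcomes from the 8-outcome RESULT #1 list are drawn independently, each equally likely, and concatenated; the function names are illustrative only and say nothing about the undisclosed method itself.

```python
from collections import Counter
from itertools import product

# The eight outcomes listed as RESULT #1 earlier in the thread.
RESULT_1 = ["00", "01", "0", "0", "1", "1", "1", "1"]

def pair_counts(outcomes):
    """Count each concatenated string over all ordered pairs of outcomes."""
    return Counter(a + b for a, b in product(outcomes, repeat=2))

def probability_table(counts):
    """Turn occurrence counts into relative frequencies."""
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}

counts = pair_counts(RESULT_1)
table = probability_table(counts)

# Print in the same "string x count" style as the post, rarest first.
for s, n in sorted(counts.items(), key=lambda kv: (kv[1], kv[0])):
    print(f"{s:>4} x {n:<2d}  p = {table[s]:.4f}")
print("total outcomes:", sum(counts.values()))
```

The same function works for longer runs: changing `repeat=2` to a larger value enumerates triples and beyond, and passing the RESULT #2 list gives the table for the 14-outcome version, though the number of combinations grows quickly.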
> I will not be posting methods until I have had time to check the results
> out, determine the statistical chances of compression, and see if
> the level of compression on this portion outweighs the increase in
> size needed to separate the portion out.

indeed, this is what i'm talking about. in experiments of my own, carried out in full, this was the final integration that, at best, led me to equality, zero sum.

i would suggest thinking very deeply about why exactly you need to separate portions out, instead of creating a pointer to the data from a generator curve.. you're capable of this level of insight if you've got this far.

consider how quantization works. check out how the bases operate on each other. it's quite neat, and they're capable of providing plenty of data with very little setup; you just need to allocate a logarithmic pointer into the stream. :)

chris.