The algorithm still doesn't have a name. I have been calling--its codename is angel, because thats what I happened to name the file, but the project also has files named things like Princess and Dutchess. And it is actually called angel2, because angel() turned out to be super inefficient. So...whatever.
The algorithm definitely runs in polynomial time for a non-empty subset of 3-SAT problems, and it is definitely correct for all of the SATLIB problems I tried it on, which is everything size 75 and under. Yeah, I know, not that impressive, but I get really bored sitting around waiting for it, and I can't reboot into my starcraft partition* while my computer is churning away.
It is definitely not linear or sub-linear (sorry for the typo last time). It has a runtime that rises quickly enough to make 125-variable problems take too long for my attention span.
Next: Math vs Science. My definition of math is that it is universal and irrefutable, like abstract geometry and...lets skip my opinions on math vs science. Anyway, since I don't understand exactly why my algorithm is working and only have a fuzzy idea. Trying to prove that it both a) solves all 3-sat instances and b) is polynomial-time for all 3-sat instances would be difficult, time consuming, and ultimately a waste of time in the likely event that I prove myself wrong, or, even worse, I never prove myself wrong and it becomes an unending time sink.
Instead, lets try a different direction. I know that if I made a big "HEY GUYS I SOLVED P=NP" anouncement like all the other tools, people--well in reality they would laugh and ignore me but in theoretical land someone would read my paper that took forever to write because I used LaTeX and then send/publish an instance of 3-sat that breaks my algorithm. Why don't I just try to find such an instace myself? My algorithm is hardcoded to halt with a result of "EPIC FAIL" if it exceeds what I think are polynomial bounds. If I can find an instance that causes it to do that, game over.
I have a rough idea of which values of n (number of variables) and m (number of clauses) are the most likely to break it; the self-imposted polynomial safetys are based on log_2(n), so for all problems with n equal to exponents of 2, this safety feature will be the least lenient. This is unfortunate, though, because n=32 is a waste of time (the algorithm solves it before getting to the interesting part) and n=128 takes like more than an hour, leaving n=64 the only one I feel like watching. Admittely, for all I talk about how boring it is to watch, I should be getting a server that I can leave running. That will happen as soon as I find the moving box that contains the hard drive with my AWS keys on them. There is also a USB key...but that is also in a box. Anyway.
Onto m (number of clauses). A lot of google hits I'm getting for hard 3-sat problems all want to talk about this phase transition that happens around 4.24 (that is, where m = n * 4.24) or maybe 4.6 or some other number depending on the paper. People are interested in this value because when you take random samples of instances, most of them are satisfiable with a factor < 4.24 and most are unsatisfiable with a factor > 4.24 and right around 4.24, its like, totally up for grabs. Many papers have insisted that the hardest problems are at 4.24 (at least for DPLL and WALKSAT). This conflicts with my own test data. In fact, I belive that, for a given n, the hardest ones are unsatisfiable instances with the lowest factor possible and in which every variable is represented by at least one literal (otherwise, you are basically not using variables and it is technically a smaller problem). While I was instrumenting my code for batch testing, I ran some tests. Can you insert tables in this thing? Hm I can insert a picture, a video, or something called a jump break. ::sigh:: oh google, why can't you work kinda like Evan's blog but different in the exact way I'm picturing in my head but am too lazy to code?
Each "test" was a batch of 300 problems each with size n=64. I varied the factor x, where m=x*n, and measured the entire user/kernel (not wall clock) time it took to measure the entire batch of 300. Heh heh. 300. Didn't even think of that until now. That's awesome. Yes, it was just these three hundred clauses, the honor guard of the expression false, which defended the free world from the Persians, and from idiots on the internet who argue which inherently wrong wall-clock-time method should be used to measure code, all of which would generate shitty results when they get bored and start watching The Guild during their test because they don't understand why Felicia Day is so popular. Yes...these three clauses. Oh heroic (v_69, v_42, !v_42)...you were the first to go!
Anyway.
Oh yeah, initial test results. Fuck I wish I got paid to do this.
- factor (# satisfiable, # unsatisfiable), time in seconds
- 2.9 (0,300) 112s
- 2.8 (0,300) 128s
- 2.7 (1,299) 141s
- 2.6 (oops) 167s
- 2.5 (14,286) 195s
- 2.4 (42,258) 238s
- 2.3 (77,223) 265s
- 2.2 (139,161) 314s
- 2.1 (202,98) 272s
- 2.0 (261,39) 180s
I'm not ashamed that I researched in a vacuum (did not read much of other people's work) because I was doing this for fun and the part I enjoyed was inventing an algorithm on my own, however, now the vacuum part is coming back to bite me in the ass. I can't read the super technical papers because understanding them depends on understanding all the papers they depend on (I mean, cite...) and none of the easy ones say anything terribly interesting. And I think I'm reading drafts. We have vocational schools that teach you how to cut hair or fix motorbikes. Why can't there be one that covers all of the existing research on NP-Completeless focusing exclusively on 3-SAT and not even spending much time on k-SAT?
What are my next moves? Realistically, I need to get an EC2 server (or some other computer than can run 24/7, but EC2 instances are better because it should always be the same hardware). I need to come up with a thing to measure...either counts of a particular operation, or the sizes of sets or something, so I can stop relying on timing, which is machine dependant.
Next then, I supposed whatever I do will involve running shit-ton of 3-sat instances. The number of possible instances for given values of n and m I believe is something like (n^3)!/( m!(n^3 - m!), where n^3 is way higher than m (not >> higher, just way higher). How many should I run? Enough to convince me, and other people, that it is worth the time to analyze this thing.
First off, what I've seen of solvable 3-sat instance generation is pathetic. People pick a solution and then generate clauses that don't violate the solution. There are some tricks people have done to make this process suck less, but it is basically worthless, especially given my hard-to-describe learnings that I...learned while staring at a hard-to-explain multi-dimensional reduction of 3-SAT constructed out of soap and bandaid boxes from 7-11. Those corners man...the corners killed me.
If you are testing an algorithm I don't understand why you can't just generate a bunch of instances, and then see which ones are SAT ? Once you have an alleged solution, its polynomial time...possibily even linear time to test the solution. Well, anyway, testing SAT instances is out for now. Besides, if we could prove than an algorithm can solve all UNSAT instances in polynomial time, it doesnt matter how long SAT instances take. SAT instances can take until the end of this and any other unverse for all I care; if you know that UNSAT is polynomial time, just stop your algorithm after it has exceeded polynomial time and declare the instance SAT. In fact, once you do that, you can start manipulating the problem (start picking values for variables) and then asking your algorithm if it is still SAT. Combine a polynomial number of such manipulations with your polynomial algorithm and now you have a polynomial algorithm to give you the solution for SAT instances. Wow I think what I just wrote could be turned into a proof. Anyway.
UNSAT. Verifying that an instance is UNSAT may be difficult if you don't feel like printing out a minimal proof tree (I really don't). However, I believe it is far easier to prove that my algorithm is correct than it is to prove why the hell it runs in polynomial time, so without loss of generality, lets assume that I have a girlfriend...oh that didn't work. Fine then lets assume that my algorithm is correct. I'll make it print out the proof later.
Here are some ideas about how to constrain the random samples to be as difficult as possible:
- no tautologies, clauses with both a literal and its negation
- no duplicate clauses
- does not contain a "trivially unsat cluster" -- the group of eight possible distinct clauses over the same three variables
- every(both) literal of every variable is expressed in at least one clause
I could set up a search for such instance, and be found asleep at my desk bathed in the monitor light of a flickering screen with headphones blaring by a troupe of fellow associates asking me for an illegal and oddly shaped diskette, whom I follow to a club because the girl has a tattoo of a white rabbit.
Or I could just make some pretty graphs.
Ooh...here is another intersting test: start with an empty set. Randomly pick a clause based on n variables, add it to the set, and run angel2 on it. Add another clause and run angel2 on it again. Repeat by adding 1 randomly selected clause and running angel again until you get UNSAT. Ok what would this accomplish...I'm not sure. Well, actually you could continue adding clauses randomly until you've added all clauses, and record how long each instance takes to solve. Then you would have a pretty graph! You could turn this into sort of an exhaustive search...but you might be better off naming every gluon in the universe. I have a pet gluon. His name is Andy. My preivous pet gluon Bernard ran off with some slutty quark and I never saw him again.
*I've gotten the impression that Microsoft employees do not appreciate me calling it that. The truth hurts.

0 comments:
Post a Comment