An optimized algorithm came to me while I was showering before work:
Imagine a symbolic pattern matcher with a six character alphabet. A typical implementation might store its history in a string like so:
Character Index->0000000000111111111122222222223333 0123456789012345678901234567890123 |||||||||||||||||||||||||||||||||| BFFADEDCEFBEDADCFCCFEEADDEBFACCEEA <-New characters appended here.
These strings are best searched using Java string libraries.
Now imagine a "folded" string that is folded at each changed symbol. Besides characters linked in a string, identical symbols could be linked one to the next using a linked list or similar data structure like so:
Character Index->0000000000111111111122222222223333 0123456789012345678901234567890123 |||||||||||||||||||||||||||||||||| A<--------A<-------A<----A<---A<-A index B<--------B<--------------B<--------B index C<------C<CC<---------CC<----C index D<D<----D<D<-------DD<----------D index E<-E<-E<-------EE<--E<----EE<--E index FF<-----F<-----F<-F<------F<-------F index ^ | New characters appended here.
To append the next character (I'll say 'A'):
Now you're ready to match a pattern:
There are two nice outcomes from this approach. First, it's fast. Most pattern matching algorithms aren't aware of their inputs before they receive them so they can't take advantage of indexing until they've already passed through the pattern and history at least once. We get around this by indexing as the string is built. Second, you don't have to guess at the length of your maximum match. The algorithm explores the absolute depth of every existing one-character match.
I've included a prototype implementation below. It comes with no guarantees. There are more than a few things to iron out. Namely:
Feel free to use and improve the code but please return the results to the wiki. Cheers. --Corbos
import java.awt.geom.Point2D; public class FoldedPatternMatcher { private PatternNode[] _nodeIndexes; private PatternNode _head; private int _scanCount = 0; final double TEN_DEGREES = 10 * Math.PI / 180; final double TWENTY_DEGREES = TEN_DEGREES * 2; final char BREAK_SCAN = '\uffff'; public FoldedPatternMatcher(){ _nodeIndexes = new PatternNode[512]; } public void record(double velocity, double headingDelta){ int index = (((int)((velocity + 8) * 15 / 16) << 5) | ((int)((headingDelta + TEN_DEGREES) * 31 / TWENTY_DEGREES))); PatternNode node = new PatternNode((char)index, _scanCount++); insert(node); node.IndexRef = _nodeIndexes[index]; _nodeIndexes[index] = node; } public void insertBreak(){ insert(new PatternNode(BREAK_SCAN, _scanCount++)); } private void insert(PatternNode node){ node.SequenceRef = _head; _head = node; if(node.SequenceRef != null){ node.SequenceRef.ReverseSequenceRef = node; } } public Point2D.Double project(Point2D.Double enemyLocation, double currentHeading, double bulletVelocity, Point2D.Double robotLocation){ Point2D.Double nextPosition = null; PatternNode indexSymbol, cursor; PatternNode lastSymbol = _head; PatternNode bestSymbol = null; indexSymbol = _head.IndexRef; int depth; int bestDepth = 0; while(indexSymbol != null && lastSymbol.Index - indexSymbol.Index < 55000){ if(lastSymbol.Index - indexSymbol.Index > 20){ cursor = indexSymbol; depth = 0; while(lastSymbol != null && cursor != null && lastSymbol.Symbol == cursor.Symbol && depth < 125){ lastSymbol = lastSymbol.SequenceRef; cursor = cursor.SequenceRef; depth++; } if(depth >= bestDepth){ bestSymbol = indexSymbol; bestDepth = depth; } } lastSymbol = _head; indexSymbol = indexSymbol.IndexRef; } if(bestSymbol != null){ double projectedTime = 0; double enemyX = enemyLocation.x; double enemyY = enemyLocation.y; bestSymbol = bestSymbol.ReverseSequenceRef; while(Point2D.distance(robotLocation.x, robotLocation.y, enemyX, enemyY) > projectedTime * bulletVelocity && bestSymbol != null){ currentHeading += bestSymbol.getHeading(); enemyX += Math.sin(currentHeading) * bestSymbol.getVelocity(); enemyY += Math.cos(currentHeading) * bestSymbol.getVelocity(); bestSymbol = bestSymbol.ReverseSequenceRef; projectedTime++; } nextPosition = new Point2D.Double(enemyX, enemyY); } return nextPosition; } class PatternNode{ public PatternNode IndexRef; public PatternNode SequenceRef; public PatternNode ReverseSequenceRef; public char Symbol; public int Index; public PatternNode(char s, int index){ Symbol = s; Index = index; } public double getVelocity(){ int i = (int)Symbol; i = i >> 5; return (((double)i) * 16 / 15) - 8; } public double getHeading(){ int i = (int)Symbol; i = i & 31; return (((double)i) * TWENTY_DEGREES / 31) - TEN_DEGREES; } } }
Cool idea! I like it when people publish their brainstorms like you just did. -- Vic
I've read the text a few times now. This seems very interesting, but it's somehow elusive for me yet. Can you provide some more wordings on how you do the match? -- PEZ
Sorry about my clarity. I was in a hurry to get the idea down and realized my explanation was lacking. I rewrote a chunk of the text above. Take a look and see if it makes more sense. --Corbos
Thanks! It now makes sense to me. It even itches some in my coding fingers. =) Seriously, I might try do something with this. Some of my PM experiments have been put to a halt because of performance issues. This might be just what the doctor ordered! -- PEZ
Hmm, your explination doesn't really make sense to me, but then again I had trouble wrapping my head around the workings of dynamic clustering there for awhile. I think I should look at more normal pattern matching. -- Chase-san
My explanation might be a bit unintuitive (given that a smarty like PEZ asked for clarification). I'm not the best at explaining but it helps if you're comfortable with some of the PatternMatching metaphors/ideas:
Frame == a snapshot of your opponent's state(velocity, heading, deltas, etc...) at a specific point in time.
Movie == a collection of Frames stored in sequence.
Standard pattern matching takes the last few real-world frames and matches them against the closest set of historical frames and then "plays the movie forward" from the match (since you've gathered everything you need to reconstruct movement) until the estimated bullet impact. From that it determines a bearing and aims.
Symbolic pattern matching (SymbolicPatternMatcher) translates one or more continuous values from each frame, like doubles or floats, to discrete values. Usually the values are part of some symbol alphabet so each frame is simply a symbol (character). It's kind of like segmentation: 0-2 == 'A', 2-4 == 'B', 4-6 == 'C', or whatever. Then, during the matching phase, you don't have to use an error threshold to decide if you have a match or not. You simply check if symbols match. A nice side-effect is there's been a lot written about pattern matching in strings, which your movie is now, effectively: https://www-igm.univ-mlv.fr/~lecroq/string/.
Folded pattern matching is simply folding your movie so every 'A' lines up with or points to the last 'A' witnessed, every 'B' points to the last 'B' witnessed, etc. They're indexed. If you see a new 'A', you don't have to search your string for it. Instead, you just follow the 'A' index and see if any exist. Saves lots of time. --Corbos
I think I got it, each A has a index to the last A before it, and such that you can just suffle backward through the list? -- Chase-san
AH HA, I understand this perfectly now, I can't believe I was so dense. By having references to the last of the letter you can more easily compare forward and backward from those positions to get a quicker match, instead of having to scan through the entire list looking for the positions then scanning. This gives me an idea to make a Folded Dynamic Clustering gun, which would be complex (as there is so much viaration, but you could "cluster" simular entries =) ). In fact if I have the time, I just go ahead and make this (as soon as I can figure out a good way to cluster simular entries without diffusing to much and just having one big cluster). --Chase-san
You might simply run out of memory if you tried to apply this concept to a DynamicClustering gun. Also, you might want to look into [K-means clustering], the idea that DynamicClustering is based off of, to get some ideas. Frankly, I'm not really sure how you could "fold" a DC system in the same way - the DC system is already kind of a simpler version of K-means. But good luck =) -- Voidious
Yah, I thought it over for awhile, I figured you would require static cluster reference points to allow a folded DC system, which is nothing more then a gf based index pattern matcher (which itself is nothing more then a complex gf gun), but 10 times more complex. So I decided to test a few shortening tricks to current DC systems in an attempt to speed them up. --Chase-san
Also, being new to PatternMatching in general. Do you think a indexed folded pattern matcher would be possible. I don't know if this would even count as a "folded" pattern matcher.
You use the index to store the locations of each index (e.g. distance, velocity, etc). Store the heading and (velocity or distance) changes in a symbol within the node itself. Then when you get a new node to repalce the last within that index you just place it as the lastSequenceNode?, so that you reference not to the last Symbol of the same kind but the same index. Then you use the symbols to compare.
You've probably fallen victim to my bad naming conventions. In the example above, SequenceRef? points backward in time. That is, tick 4 points at tick 3, tick 3 at tick 2, 2 at 1. Following ReverseSequenceRef? actually "plays the movie forward" like you describe: 1 -> 2 -> 3 ...
As for an indexed folded PM, that's really what this is all about. The symbols are indexed by "folding" the movie. If I had any sense, I would have called it an IndexedPatternMatcher?. However, as evidenced by the SequenceRef/ReverseSequenceRef? debacle, I'm often without sense - especially when it comes to naming. I pictured a folded string of characters and the name stuck in my head. Cheers. --Corbos
Then that means that within PatternNode?, SequenceRef? points backwards along the movie, and ReverseSequenceRef? points forward(that seems backwards to me, but meh), and IndexRef? points to the last of the Symbol? If thats so your code makes alot more sense to me now. Also may I ask, why do you bothered convering the symbol to a char, an int and a char are almost synchronis in java, and the way you compare them wouldn't make much difference (in fact Elipse lets me compare them without a cast).
On mine, by a index I mean Node[] index = new Node[DISTANCE_INDEXES]; and I wouldn't keep a symbol index at all (Since its not used in the comparing system at all I don't need it). I wouldn't even need to keep distance index data within the node as thats built into the index.
You're right that you can just use numbers... I'm guessing symbols are used here for the sake of comparing it to a vanilla SymbolicPatternMatcher (near the top of the page), in which using characters allows you to make use of Java's string matching methods. -- Voidious