I'm building a price comparison engine that, for now, will only use nicely formatted XML data feeds.
The feeds lack any sort of universal product identifier (such as a manufacturer code), so I have pre-parsing routines that format the product name to something standard. I've been plowing through this project and only now realized that I don't know how I should go about matching up the products and indicating that they're matched.
Right now, I prep the feeds as arrays, so I figured I could just do the comparison once all the data is sitting in the arrays, but I'm realizing that doing so is more complicated than I originally thought.
What is the most efficient way to find the duplicate entries? The array_unique function, as I understand it, won't help me because it only returns a list of unique elements; it doesn't tell me that SKU 341434 @ Vendor1 is the same as SKU u8349 @ Vendor2.
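To make the problem concrete, here is a rough sketch of what I mean. The sample rows, the field names (`vendor`, `sku`, `name`) and the product names are all made up for illustration, and the grouping loop is only a naive guess at the kind of result I'm after. It assumes my pre-parsing produces exactly identical name strings for the same product, which may not hold in practice.

```php
<?php
// Hypothetical sample rows; field names and values are invented for this example.
$products = [
    ['vendor' => 'Vendor1', 'sku' => '341434', 'name' => 'acme widget 500'],
    ['vendor' => 'Vendor2', 'sku' => 'u8349',  'name' => 'acme widget 500'],
    ['vendor' => 'Vendor2', 'sku' => 'u9001',  'name' => 'acme gadget 20'],
];

// array_unique() keeps the first occurrence of each value and drops the rest,
// so the fact that two vendors share a name is lost.
$uniqueNames = array_unique(array_column($products, 'name'));

// What I actually want: every vendor/SKU pair that shares a normalized name.
$groups = [];
foreach ($products as $p) {
    $groups[$p['name']][] = $p['vendor'] . ':' . $p['sku'];
}

// Keep only the names that occur in more than one feed entry.
$matches = array_filter($groups, function ($skus) {
    return count($skus) > 1;
});

print_r($matches);
```

With this sample data, `$matches` ends up with a single entry for the shared name, listing both vendor/SKU pairs. That is the kind of result I need, but I don't know whether keying an array on the name like this is a sensible or efficient way to get it.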
Thanks for your help!