RegEx ~ not grabbing src - need help

simpleonline12

New member
Sep 29, 2009
Hey guys, I'm trying to set up my RegEx to grab the URL from the src attribute of <img> tags.

Right now my code doesn't output anything when I have it set up this way.

Code:
Imports System.Text.RegularExpressions

Public Class Form1

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim net As New Net.WebClient()
        Dim src As String
        src = net.DownloadString("http://www.wikihow.com/Make-Easy-Homemade-Biscuits")

        ' Create a match using regular expressions
        Dim m As Match = Regex.Match(src, "")

        ' Spit out the value plucked from the code
        RichTextBox1.Text = m.Value


    End Sub
End Class

Any ideas on how I can set up my regex to grab the URL in the src?

Example:

src="this is my image"
 


Try this ...

Code:
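Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
    ' Requires "Imports System.Text.RegularExpressions" at the top of the file
    Dim net As New Net.WebClient()
    Dim src As String
    src = net.DownloadString("http://www.wikihow.com/Make-Easy-Homemade-Biscuits")

    ' Group 1 captures the src value of each <img> tag (double-quoted values only)
    Dim matches As MatchCollection = Regex.Matches(src, "<img[^>]*src=""(.+?)""", RegexOptions.IgnoreCase)

    ' Print each captured URL, separated by a blank line
    For Each match As Match In matches
        RichTextBox1.AppendText(match.Groups(1).Value & vbCrLf & vbCrLf)
    Next
End Sub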
 
I recommend using an online regex tester if you get stuck writing a regex. It's helped me a ton in the past: RegExr
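
If you'd rather test a pattern offline, here's a minimal console sketch that runs against a hard-coded sample string instead of the live page. It assumes the src value may be wrapped in either double or single quotes and that the tag may be upper or lower case. The module name ImgSrcDemo, the function ExtractImageSources, and the sample markup are all made up for illustration; none of it comes from the actual wikiHow source.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ImgSrcDemo

    ' Hypothetical helper: returns the src value of every <img> tag in the HTML.
    ' Handles src="..." and src='...', in any letter case.
    Function ExtractImageSources(ByVal html As String) As List(Of String)
        ' Group 1 captures the opening quote; \1 requires the same closing quote.
        ' Group 2 captures the value between the quotes.
        Dim pattern As String = "<img\b[^>]*?\bsrc\s*=\s*([""'])(.*?)\1"
        Dim results As New List(Of String)()

        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            results.Add(m.Groups(2).Value)
        Next

        Return results
    End Function

    Sub Main()
        ' Made-up sample markup, not the real page source.
        Dim sample As String = "<p><IMG alt=""a"" SRC=""/images/one.jpg""> <img src='two.png' width='10'></p>"

        For Each url As String In ExtractImageSources(sample)
            Console.WriteLine(url)
        Next
    End Sub

End Module
```

With that sample it should print /images/one.jpg and two.png on separate lines. It's still a regex, so it will trip over things like a ">" inside an attribute value or attributes such as data-src; for anything serious an HTML parser is the safer route.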

Depending on what exactly you need, you could get away with scraping the images with a pretty dirty one-liner. I'm sure this could be 1000x more polished and it might not cover everything (it only looks at single-quoted src values and skips anything ending in "j" or "s" to dodge .js files), but it looks like it'd do the trick:

curl -so- "http://www.wikihow.com/Make-Easy-Homemade-Biscuits" | grep -Eo "src='[^']*[^'js]'" | cut -c 5- | xargs -n1 wget
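
If you want to stay in VB.NET rather than shell out, the download half of that one-liner could be sketched roughly like this. It assumes you already have the raw src values from somewhere; the module and function names (ImageDownloadSketch, ResolveImageUrls, DownloadAll) and the sample values in Main are hypothetical. Relative paths are resolved against the page address, and .js files and non-HTTP schemes are skipped.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Net

Module ImageDownloadSketch

    ' Turns raw src values into absolute URLs relative to the page address,
    ' skipping anything that fails to parse, is not http/https, or ends in .js.
    Function ResolveImageUrls(ByVal pageUrl As String, ByVal sources As IEnumerable(Of String)) As List(Of Uri)
        Dim baseUri As New Uri(pageUrl)
        Dim resolved As New List(Of Uri)()

        For Each src As String In sources
            Dim absolute As Uri = Nothing
            If Not Uri.TryCreate(baseUri, src, absolute) Then Continue For
            If absolute.Scheme <> Uri.UriSchemeHttp AndAlso absolute.Scheme <> Uri.UriSchemeHttps Then Continue For
            If absolute.AbsolutePath.EndsWith(".js", StringComparison.OrdinalIgnoreCase) Then Continue For
            resolved.Add(absolute)
        Next

        Return resolved
    End Function

    ' Saves each URL into targetFolder under its original file name.
    ' Needs network access; error handling is left out of this sketch.
    Sub DownloadAll(ByVal urls As IEnumerable(Of Uri), ByVal targetFolder As String)
        Using client As New WebClient()
            For Each u As Uri In urls
                Dim fileName As String = Path.GetFileName(u.LocalPath)
                If fileName.Length = 0 Then Continue For
                client.DownloadFile(u, Path.Combine(targetFolder, fileName))
            Next
        End Using
    End Sub

    Sub Main()
        Dim page As String = "http://www.wikihow.com/Make-Easy-Homemade-Biscuits"
        ' Made-up src values for illustration only.
        Dim sources As String() = New String() {"/images/one.jpg", "two.png", "/js/site.js"}

        ' Only prints the resolved URLs; pass the list to DownloadAll to fetch them.
        For Each u As Uri In ResolveImageUrls(page, sources)
            Console.WriteLine(u.AbsoluteUri)
        Next
    End Sub

End Module
```

Main only prints the resolved addresses so it can be run without touching the network; calling DownloadAll with that list and an existing folder would do the actual fetching.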
 