Sharing a link together with a snapshot of the web page is a very useful feature in Microsoft Teams and Zoom chat. It lets a user glance at the page snapshot and a short description of the link.
Go has an amazing library called chromedp that lets you drive a browser to take screenshots, fill out and submit forms, send key events, emulate devices, evaluate scripts and more. You can look at the complete example list here.
We are interested in taking a screenshot of the web page for a given URL. Chromedp allows you to take a screenshot of a specific element of the page or of the entire browser viewport.
My use case is to capture just enough of the web page to give the user an idea of the website. Looking at a screenshot and recalling whether one has visited the link is easier than looking at the link itself. Here is sample code that takes a screenshot of a given web page.
package main

import (
	"context"
	"crypto/rand"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/chromedp/cdproto/emulation"
	"github.com/chromedp/chromedp"
)

const (
	UserAgentName = "Websnapv1.0"
	Path          = "images\\" // Windows-style path; the images directory must already exist
	Timeout       = 15         // seconds
)

// https://stackoverflow.com/questions/22892120/how-to-generate-a-random-string-of-a-fixed-length-in-go
func GenRandStr(n int) string {
	var alphabet = []byte("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
	b := make([]byte, n)
	rand.Read(b)
	// Map every random byte onto a letter of the alphabet.
	for i := 0; i < n; i++ {
		b[i] = alphabet[b[i]%byte(len(alphabet))]
	}
	return string(b)
}

func Snap(url string) (string, error) {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	// Creating timeout for 15 seconds
	ctx, cancel = context.WithTimeout(ctx, time.Second*Timeout)
	defer cancel()

	name := GenRandStr(12)
	var buf []byte
	var file = Path + name + ".png"

	err := chromedp.Run(ctx,
		emulation.SetUserAgentOverride(UserAgentName),
		chromedp.Navigate(url),
		//chromedp.FullScreenshot(&buf, 100),
		chromedp.CaptureScreenshot(&buf),
	)
	if err != nil {
		// Return the error instead of calling log.Fatalf, which would exit
		// the program before the caller could handle it.
		return "", fmt.Errorf("webext: unable to capture screenshot of %s: %w", url, err)
	}

	if err := os.WriteFile(file, buf, 0o644); err != nil {
		return "", fmt.Errorf("webext: cannot create snap for the link %s: %w", url, err)
	}
	return file, nil
}

func main() {
	i, err := Snap("https://medium.com/@prashantkhandelwal/marvel-comics-universe-graph-database-7264381478e1")
	if err != nil {
		log.Fatalf("ERROR: Unable to retrieve screenshot - %v", err)
	}
	// print the image path
	fmt.Println(i)
}
Here, I have declared a few constants that are easy to understand. Then there is a random string generator, which I found on Stack Overflow, to generate random image names. Finally, the Snap function takes a screenshot of the web page. Overall, this function is simple to follow, but I want you to pay special attention to the part where chromedp is used:
err := chromedp.Run(ctx,
	emulation.SetUserAgentOverride(UserAgentName),
	chromedp.Navigate(url),
	chromedp.CaptureScreenshot(&buf),
)
CaptureScreenshot captures the visible part of the web page, that is, the current browser viewport. It accepts a pointer to a byte buffer and fills it with the image data, which Snap then writes to an image file. Here is an example output:
The next useful function is FullScreenshot. It captures the entire web page as an image. You can use it in your functional tests to check how your web page looks when accessed from a different location or with different parameters. It takes two parameters: the first is a pointer to the buffer, just like before, and the second is the quality of the image. Because the screenshot covers the entire web page, I am showing it inside an image viewer program to give you a sense of its size. Here is an example output:
With screenshots done, let's also get some basic information about the web page. Let's create a new function named ExtractMeta. It accepts the URL of the web page as a parameter and returns a pointer to a WebData struct, which holds the Title and Description of the web page. This function looks almost exactly like Snap, except for a slight change in how Run is used and a few variables that hold the returned values. Here is the code for extracting the metadata:
var pageTitle, description string
var w = &WebData{}

err := chromedp.Run(ctx,
	emulation.SetUserAgentOverride(UserAgentName),
	chromedp.Navigate(url),
	chromedp.Title(&pageTitle),
	chromedp.Evaluate(`document.querySelector("meta[name^='description' i]").getAttribute('content');`, &description),
)

w.Title = pageTitle
w.Description = description
Notice that the Run call takes two additional actions, chromedp.Title and chromedp.Evaluate. chromedp.Title retrieves the title of the web page. chromedp.Evaluate lets you execute a JavaScript expression on the page being visited and returns the result so you can use it. For our use case, which is to get the description of the web page, we run document.querySelector against the meta tags and select the one whose name attribute begins with description (the ^= operator is a prefix match). The i is the case-insensitivity flag. Add the code below to the main function:
w, err := ExtractMeta("https://medium.com/@prashantkhandelwal/marvel-comics-universe-graph-database-7264381478e1")
if err != nil {
	log.Fatalf("ERROR: Unable to retrieve metadata - %v", err.Error())
}

fmt.Println(w.Title)
fmt.Println(w.Description)
Executing the above code generates output like this:
Similarly, you can run Evaluate with other expressions to get more information from the web page as needed.
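One way to reuse the Evaluate step for other tags is to build the JavaScript expression from the attribute you want to match. The helper below, metaContentJS, is a hypothetical name of my own and not part of chromedp. It uses an exact, case-insensitive match rather than the prefix match shown earlier, and it assumes the attribute name and value are simple tokens without quotes. It also guards against a missing tag: querySelector returns null when nothing matches, and calling getAttribute on null throws, which I would expect to surface as an error from Run.

```go
package main

import "fmt"

// metaContentJS is a hypothetical helper. It builds a JavaScript expression
// that reads the content of the first <meta> tag whose attribute attr equals
// value (case-insensitively) and yields "" when no such tag exists.
// Assumption: attr and value contain no quotes or backslashes.
func metaContentJS(attr, value string) string {
	selector := fmt.Sprintf("meta[%s='%s' i]", attr, value)
	// %q wraps the selector in double quotes, which is also a valid JS
	// string literal for plain ASCII input. The ?. and ?? operators avoid
	// calling getAttribute on null and fall back to an empty string.
	return fmt.Sprintf("document.querySelector(%q)?.getAttribute('content') ?? ''", selector)
}

func main() {
	fmt.Println(metaContentJS("name", "description"))
	fmt.Println(metaContentJS("property", "og:image"))
}
```

The returned string can be passed to chromedp.Evaluate in place of the hard-coded expression, with a string variable as the target, just as description is used above.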
Here is the complete code for reference:
package main

import (
	"context"
	"crypto/rand"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/chromedp/cdproto/emulation"
	"github.com/chromedp/chromedp"
)

const (
	UserAgentName = "Websnapv1.0"
	Path          = "images\\" // Windows-style path; the images directory must already exist
	Timeout       = 15         // seconds
)

type WebData struct {
	Title       string
	Description string
}

// https://stackoverflow.com/questions/22892120/how-to-generate-a-random-string-of-a-fixed-length-in-go
func GenRandStr(n int) string {
	var alphabet = []byte("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
	b := make([]byte, n)
	rand.Read(b)
	// Map every random byte onto a letter of the alphabet.
	for i := 0; i < n; i++ {
		b[i] = alphabet[b[i]%byte(len(alphabet))]
	}
	return string(b)
}

func Snap(url string) (string, error) {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	// Creating timeout for 15 seconds
	ctx, cancel = context.WithTimeout(ctx, time.Second*Timeout)
	defer cancel()

	name := GenRandStr(12)
	var buf []byte
	var file = Path + name + ".png"

	err := chromedp.Run(ctx,
		emulation.SetUserAgentOverride(UserAgentName),
		chromedp.Navigate(url),
		//chromedp.FullScreenshot(&buf, 100),
		chromedp.CaptureScreenshot(&buf),
	)
	if err != nil {
		// Return the error instead of calling log.Fatalf, which would exit
		// the program before the caller could handle it.
		return "", fmt.Errorf("webext: unable to capture screenshot of %s: %w", url, err)
	}

	if err := os.WriteFile(file, buf, 0o644); err != nil {
		return "", fmt.Errorf("webext: cannot create snap for the link %s: %w", url, err)
	}
	return file, nil
}

func ExtractMeta(url string) (*WebData, error) {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	// Creating timeout for 15 seconds
	ctx, cancel = context.WithTimeout(ctx, time.Second*Timeout)
	defer cancel()

	var pageTitle, description string
	var w = &WebData{}

	err := chromedp.Run(ctx,
		emulation.SetUserAgentOverride(UserAgentName),
		chromedp.Navigate(url),
		chromedp.Title(&pageTitle),
		chromedp.Evaluate(`document.querySelector("meta[name^='description' i]").getAttribute('content');`, &description),
	)
	if err != nil {
		return nil, fmt.Errorf("webext: unable to extract meta(s) from the given URL %s: %w", url, err)
	}

	w.Title = pageTitle
	w.Description = description
	return w, nil
}

func main() {
	i, err := Snap("https://medium.com/@prashantkhandelwal/marvel-comics-universe-graph-database-7264381478e1")
	if err != nil {
		log.Fatalf("ERROR: Unable to retrieve screenshot - %v", err)
	}
	// print the image path
	fmt.Println(i)

	// Extract metadata of the page
	w, err := ExtractMeta("https://medium.com/@prashantkhandelwal/marvel-comics-universe-graph-database-7264381478e1")
	if err != nil {
		log.Fatalf("ERROR: Unable to retrieve metadata - %v", err)
	}
	fmt.Println(w.Title)
	fmt.Println(w.Description)
}