Google searches for "Netflix X", where X is a movie title, usually find the Netflix page for that movie, even though Netflix's robots.txt prevents it from being crawled. (Not too surprising--- if there are enough inbound links, Google can index a page even when it can't crawl the page itself.)
So out of curiosity I took a look at their robots.txt, and not only does it have comments, it also has a disabled section left in place:
    # Uncomment this when we start generating sitemaps again.
    #Sitemap: http://movies.netflix.com/sitemap_Movies.xml.gz

The implication is that the deployment process for the website just does a "copy," not a "build." This is pretty common for websites--- you can find lots of comments and commented-out sections in HTML and JavaScript documents.
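For contrast, a "build" step could be as small as a filter that drops comment lines on the way to the deploy directory. Here is a sketch (the script name, the paths, and the whole workflow are my own guesses, not anything taken from Netflix's actual setup):

    #!/usr/bin/env python3
    # strip_robots_comments.py -- hypothetical build step: copy robots.txt
    # into the deploy directory with internal comment lines removed.
    import sys

    def strip_comments(lines):
        # robots.txt comments run from '#' to end of line, so dropping them
        # also drops commented-out directives like "#Sitemap: ...".
        for line in lines:
            kept = line.split("#", 1)[0].rstrip()
            if kept:
                yield kept + "\n"

    if __name__ == "__main__":
        src, dst = sys.argv[1], sys.argv[2]  # e.g. site/robots.txt deploy/robots.txt
        with open(src) as f_in, open(dst, "w") as f_out:
            f_out.writelines(strip_comments(f_in))

Even something that small would have kept the "uncomment this when..." note out of the public copy.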
Netflix isn't unique: http://www.tintri.com/robots.txt has a ton of boilerplate text that obviously came with the web server or framework.
There are exceptions:
http://cnn.com/robots.txt doesn't have any comments.
http://facebook.com/robots.txt has comments that are directed at outside people (like they should be)!

But what surprises me most is that I couldn't quickly find tools or a best-practices guide for stripping out "internally"-directed comments, other than the JavaScript minification and obfuscation tools whose main goal is reducing size.
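The closest thing I can picture is a dumb little filter like this one (a sketch, not a real tool; it keeps IE conditional comments, which is exactly the kind of special case that probably makes a general-purpose stripper harder than it sounds):

    #!/usr/bin/env python3
    # strip_html_comments.py -- hypothetical sketch: remove <!-- ... --> comments
    # from HTML on stdin, but leave IE conditional comments in place.
    import re
    import sys

    COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)

    def strip(html):
        def replace(match):
            body = match.group(1)
            # Conditional comments like <!--[if lt IE 9]> ... <![endif]--> are
            # meaningful to some browsers, so keep them verbatim.
            if body.lstrip().startswith("[if"):
                return match.group(0)
            return ""
        return COMMENT.sub(replace, html)

    if __name__ == "__main__":
        sys.stdout.write(strip(sys.stdin.read()))

Run it as "python3 strip_html_comments.py < index.html > index.stripped.html". Unlike a minifier it leaves whitespace and formatting alone; it only removes the remarks that were never meant for readers outside the company.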