It seems like every few months I find myself cracking the login and upload or download process on some site -- sorry, some "web application". Invariably they either don't provide an API, or their API is wholly inadequate. The "new web" doesn't want you to script it, because that might prevent them from forcing lock-in on you. They all want to be titans of the industry like CompuServe or AOL, apparently not having heard about this little thing called "The Internet" that got really popular for a minute back in the 90s.
So to do the things I want to do, I often have to crack their undocumented protocols and half-assed security measures. I don't enjoy it, but for my sanity and out of self-defense, I do it a lot. "Nation Suddenly Realizes This Just Going To Be A Thing That Happens From Now On".
The kind of discoveries I end up needing to make usually look like:
- Their OAuth "application" API is inadequate and intentionally crippled, so let's go straight for the web login page and get a session cookie.
- Oh look, here's the magic URL you are squirting JSON data down.
- Oh, but the arguments to that URL are signed.
- Oh, here's the signing key you embedded in the code but tried to hide.
- (And you're sniffing user agents. Aw, that's cute.)
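Once those pieces have been dug out, replaying a request from a script is mostly bookkeeping. Here is a minimal sketch of the shape it tends to take. Everything specific in it is an assumption made for illustration: the key, the URL, the parameter names, the cookie, and the guess that the signature is an HMAC-SHA256 over the sorted query string. Real sites each invent their own scheme.

```python
import hashlib
import hmac
import urllib.parse
import urllib.request

# All hypothetical: the key, the User-Agent string, and the signing scheme.
SIGNING_KEY = b"key-fished-out-of-their-minified-js"
BROWSER_UA = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11) "
              "AppleWebKit/601.1 (KHTML, like Gecko) Version/9.0 Safari/601.1")


def sign_params(params, key=SIGNING_KEY):
    """Return a copy of params with a hex 'sig' entry added."""
    # Sort so the signature does not depend on argument order.
    canonical = urllib.parse.urlencode(sorted(params.items()))
    sig = hmac.new(key, canonical.encode("utf-8"), hashlib.sha256).hexdigest()
    signed = dict(params)
    signed["sig"] = sig
    return signed


def build_request(url, params, session_cookie):
    """Build (but do not send) a signed GET that looks like a browser's."""
    query = urllib.parse.urlencode(sign_params(params))
    req = urllib.request.Request(url + "?" + query)
    req.add_header("User-Agent", BROWSER_UA)   # defeat the UA sniffing
    req.add_header("Cookie", session_cookie)   # cookie from the web login
    return req


if __name__ == "__main__":
    req = build_request("https://example.com/api/upload",
                        {"album": "42", "title": "cat"},
                        "session=abc123")
    print(req.full_url)
    print(req.header_items())
```

Nothing here touches the network; passing the result to urllib.request.urlopen() would send it.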
I don't have proper tools to easily do the sorts of things I need to do to solve these problems. I mean, I manage, obviously, but it sucks. Here are the kind of questions I find myself asking that are harder to answer than they should be:
- This form's "Submit" button isn't actually a form element, and the source doesn't have an onclick handler on it. Something somewhere else has installed a handler ...somewhere... so that when I click it, a JS function runs and a URL gets loaded. What function? What URL?
- Clicking this thing reads and writes a bunch of data to random URLs via XMLHttpRequest, then does a redirect. What URLs did it load and what did it send and receive? Sometimes I can answer this question using the Resources or Timeline panel in Safari's inspector, but as far as I can tell, the intermediate data vanishes from the timeline as soon as the top-level URL changes, or the DOM gets zeroed out, or something. I don't know. I just know that I can't see a record of URLs being loaded that I know were loaded. Mozilla and Firebug don't seem to be any better than Safari in this respect. "Oh, the document is gone, you must not care about it any more."
I could use mitmproxy and Wireshark for some of this, but that's a huge pain in the ass, and more heavy-handed than I usually need. Also Wireshark is awful (it always leaves me thinking "How was this supposed to be any better than tcpdump?"). It makes much more sense to intercept this stuff inside the browser. All the information is in there since it's the thing initiating contact with the server.
Chrome devtools lets you preserve the network and console tabs’ contents when the page changes.
Maybe you should write a browser.
Nearly everything you're asking for is inside of the Chrome debugger. Say what you will of Google, but their developers have made a really excellent debugging tool here. I have been doing a considerable amount of JS debugging lately and I don't think I could do it using, say, Safari or Firefox. Their developer tools are horrible.
You can monitor XHR, attack JS directly, modify JS code in place and see what handlers are attached to what DOM Objects. That's more than enough to take apart most APIs.
Chrome's JS debugger is fantastic. You can set breakpoints on XHR and trace async calls. It'll even maintain breakpoints across page reloads.
I was going to say the same thing. One additional feature to note in Chrome is that you can inspect an element, say a fake form button, and see the event listeners attached to it.
You can, but this gets a bit tricky as you'll invariably end up in the framework code since almost no one binds event listeners directly, instead using something like jQuery. It takes a bit of work to figure out how to get to the actual bindings from there.
I find it is usually simpler to break on the XHR request and then inspect the stack trace to follow the execution flow back to the actual bound application code.
Eh, you say that, but last time I checked Chrome's debugger had two misfeatures that render it more or less useless for reverse engineering other people's code.
Not my experience at all debugging React, Meteor, or Angular, all "frameworks du jour".
While I agree it's difficult to debug minimized JS, that problem has been pretty much fixed in recent versions: pretty print lets you set breakpoints within the pretty-printed JS. I haven't had issues with that in a while.
Also, contending that everyone is an "idiot" exercising "idiocy" by using frameworks doesn't strengthen your argument. Not everyone wants to write the world from first principles like it's 1995.
Well, I do, but I accept that I am in the sad situation of needing to crack systems written by people with more regrettable tastes than mine.
Never mind; you're interested in what's going on at the browser end, so I'd use the Chrome debugging tools.
That is the worst possible name for that program.
It looks intentional, especially as there's an application in the suite called 'handjoob'.
Cool casual sexism. Are they 12 years old? I'm embarrassed for everyone involved.
For HTTP-level things, I've had some luck with Charles - it has a workable timeline or per-host view, and it can MITM TLS connections for you to show you what's going on under the hood.
On the other hand, I don't believe that the Firefox dev tools let you set a breakpoint on XHR activity, which is a nice feature of the Chrome dev tools.
A golden tool for this type of work is Fiddler. It neatly inserts itself as an HTTP proxy, and allows you to inspect and modify all HTTP(S) traffic between a browser and a host. Anything the browser does, you can do, and anything the server sends back, you can capture.
In addition to being helpful for reverse engineering, it's also quite handy for debugging.
Fiddler is an alright tool, but:
2. Fiddler is Windoze-only. He's using Safari. My Common Sense is tingling, and it's telling me he won't want to run a VM just for this.
I believe that jwz wouldn't run Windows for anything, even with a gun to his head.
Key quote: "Microsoft killed my company, and I hold a personal grudge. I don't use any Microsoft products and neither should you."
I forget how extensively WINE is also out of the VMing Core-question. Poly-nope @as_*void() not a Lambda 2GB hunk of DDR3 we'd wink at http://blog.dustinkirkland.com/2016/03/ubuntu-on-windows.html
Oh...if you'll tolerate mono, http://fiddler.wikidot.com/mono (in alpha (blogged e.g. http://www.telerik.com/blogs/fiddler-for-linux-updated etc. books, google groups, StackFapper, etc.) ; non-alpha is written to .Net 2 and 4 by telerik.com.)
Agreeing about Chrome dev tools being awesome for stuff like this.
I'd also add that CasperJS or simply PhantomJS are also useful for scripting these kinds of interactions if you're too lazy to reverse-engineer low level details and just want to do high-level interactions in a headless browser.
Is it less efficient to actually load and render the page in a headless browser and simulate an onclick event? Yes.
Do you need the cronjob you're using to download ultraporn to be so computationally efficient it'll be an example in the next edition of AoCP? Probably not.
I'll continue the irrelevant thread by mentioning that in addition to PhantomJS, the Selenium project is pretty good at automating a number of real browsers and giving you access to the DOM so you can do things like "when you see a form field with name like 'login' or 'username' put my username in there" or "when you see a submit button, click it"
As an added bonus it has Perl support!
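For what it's worth, the "field with a name like 'login' or 'username'" rule is simple enough to sketch on its own. The following is not Selenium code and uses none of its API; it is a standard-library illustration of the matching heuristic only, run against static HTML. The hint list and the class and function names are made up for the example.

```python
from html.parser import HTMLParser

# Hypothetical substrings that suggest a username-style field.
LOGIN_HINTS = ("login", "user", "email")


class LoginFormScanner(HTMLParser):
    """Collect the names of likely username/password inputs in a page."""

    def __init__(self):
        super().__init__()
        self.username_fields = []
        self.password_fields = []
        self.submit_buttons = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # An <input> with no type is a text field; a <button> with no
        # type submits its form.
        if tag == "input":
            kind = (attr.get("type") or "text").lower()
        elif tag == "button":
            kind = (attr.get("type") or "submit").lower()
        else:
            return
        name = (attr.get("name") or "").lower()
        if kind == "password":
            self.password_fields.append(name)
        elif kind == "submit":
            self.submit_buttons += 1
        elif kind in ("text", "email") and any(h in name for h in LOGIN_HINTS):
            self.username_fields.append(name)


def scan(html):
    """Parse an HTML string and return the populated scanner."""
    scanner = LoginFormScanner()
    scanner.feed(html)
    return scanner


if __name__ == "__main__":
    page = ('<form><input type="text" name="UserName">'
            '<input type="password" name="pw"><button>Go</button></form>')
    found = scan(page)
    print(found.username_fields, found.password_fields, found.submit_buttons)
```

A real browser driver earns its keep when the form is built by JavaScript after load, which static parsing like this will never see.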
If you're using Wireshark, then you should instead be using some sort of MITM proxy. People have suggested Burp (free version sucks), Fiddler (Windows) and Charles (not terrible), but I'd actually suggest OWASP ZAP.
For client side stuff, Chrome's tools are far better than Safari's.
As with option #1 in https://www.jwz.org/doc/backups.html , the only winning move is not to play.
HttpFox solves all your "what network request went where and what did it contain" needs. It won't forget them until told to, and it captures everything, including plugin activity.
If you are really lazy you can buy a copy of Fake from the app store. It is a browser with an Automator-like scripting language. I use it to fetch website logs and financial information. The main advantage is that it provides a superficial interface so you don't have to grovel deep inside ten layers of JS package and DOM structure to figure out what is happening.
Chrome's debugging tools will also render any request the page makes as a curl command, if you right-click on it in the Network panel. But for the kind of puppeting of the zombie corpse of web development you're talking about, PhantomJS is a headless web browser with a sane API that - so far - gets around most of the "You're not my real browser" tricks.
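If you would rather replay that copied curl line from a script than from a shell, a rough translation is short. This is a sketch under assumptions: it expects the bash-style output with plain quoted arguments, understands only the URL, -H, -X and the --data family, and will misparse any other option that takes an argument. The function name and the example URL are invented here.

```python
import shlex
import urllib.request


def request_from_curl(command):
    """Turn a simple `curl URL -H ... --data ...` line into a Request."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "curl":
        raise ValueError("not a curl command")
    url, headers, body, method = None, {}, None, None
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-H", "--header"):
            # "Name: value" -> split at the first colon only.
            name, _, value = tokens[i + 1].partition(":")
            headers[name.strip()] = value.strip()
            i += 2
        elif tok in ("-d", "--data", "--data-raw", "--data-binary"):
            body = tokens[i + 1].encode("utf-8")
            i += 2
        elif tok in ("-X", "--request"):
            method = tokens[i + 1]
            i += 2
        elif tok.startswith("-"):
            i += 1  # skip flags we don't model, e.g. --compressed
        else:
            url = tok
            i += 1
    if url is None:
        raise ValueError("no URL found")
    # With method=None, urllib picks POST when there is a body, else GET.
    return urllib.request.Request(url, data=body, headers=headers, method=method)


if __name__ == "__main__":
    req = request_from_curl(
        "curl 'https://example.com/api/list?page=2' "
        "-H 'Cookie: session=abc123' --compressed")
    print(req.get_method(), req.full_url, req.header_items())
```

The returned object is only built, not sent; urllib.request.urlopen(req) would actually make the call.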
I second Burp Proxy.
While it's out-of-browser (and not open source), Chrome and Firefox dev tools pale in comparison to it for this use case. And it's arguably less of a PITA than mitmproxy.
I don't know why more people aren't saying Firebug. It's a lot better than the johnny-come-lately built-in Firefox debugger.
Open Firebug, HTML tab, click the arrow thingy. Select the pesky submit button (either by moving the mouse over the document and clicking, or by using the HTML view). Now go to the events tab on the right to get all applicable listeners.
Open Firebug, network tab, toggle the "Persist" button. Firebug now remains open and keeps history between page redirects. You can also do the same on the console tab.
Open Firebug, HTML tab, select the video element, right click and "view in DOM inspector". It's the currentSrc value.
Open Firebug, network tab, the little yellow pause icon with "XHR" written on it ("Break on XHR"), which will bring you into the script debugger at the next request. Yes, it's not the same thing as breaking on all network activity, but unless the page is using document.write("<img>") smoke signals for communication, usually it's all you need.
Once taken to the script debugger, go to the "Stack" tab which has the full stack trace.
Thanks for mentioning Firebug. And that it's not the built-in anymore. And the walksies.
It sounds like you want something somewhat scriptable ... how about https://github.com/sidorares/crconsole