I also have a project that revolves around the group forums and partially involves scraping data.
Scraping is absolutely allowed. The only thing that Valve forbids in their service is automation (like market transactions, spamming, etc.).
It is just a stupid thing that I am thinking of doing. It is not really useful but I just wanted to analyze some game forums (not the popular ones).
I never said it was not allowed or allowed.
The general consensus is that scraping websites is frowned upon, which is why you get temporary IP bans more often than you would for overstepping rate limits on APIs.
Now I have repeated myself twice. If need be, I can do it in Deutsch as well, if you still don't understand what I wrote.
As not every topic is instantly locked or deleted (it could be days, weeks, months, or years), you would have to check the table for that change.
I did make something similar, but again, I don't actually update the table row or check all the topics for changes.
But I guess it is somewhat manageable if you do it smartly, getting 50 topics per page.
Over a short duration I can see this being manageable, but once your DB table hits more than 50k topics you are going to have trouble, I guess.
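To make the "check the table for that change" idea concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table layout, the column names, the status values, and the `record_page` helper are all assumptions for illustration, not anything from an actual scraper. It stores one row per topic and reports which topics changed status (for example from open to locked) since the last time they were seen.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open (or create) the topics table. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS topics (
               topic_id  TEXT PRIMARY KEY,
               title     TEXT NOT NULL,
               status    TEXT NOT NULL,  -- assumed values: 'open', 'locked', 'deleted'
               last_seen REAL NOT NULL
           )"""
    )
    return conn


def record_page(conn, topics, now=None):
    """Store one page of (topic_id, title, status) tuples.

    Returns the ids of topics whose stored status differs from the new one.
    """
    now = time.time() if now is None else now
    changed = []
    for topic_id, title, status in topics:
        row = conn.execute(
            "SELECT status FROM topics WHERE topic_id = ?", (topic_id,)
        ).fetchone()
        if row is None:
            # First time we see this topic: insert a new row.
            conn.execute(
                "INSERT INTO topics (topic_id, title, status, last_seen) "
                "VALUES (?, ?, ?, ?)",
                (topic_id, title, status, now),
            )
        else:
            if row[0] != status:
                changed.append(topic_id)
            # Known topic: refresh the row so later checks compare against this visit.
            conn.execute(
                "UPDATE topics SET title = ?, status = ?, last_seen = ? "
                "WHERE topic_id = ?",
                (title, status, now, topic_id),
            )
    conn.commit()
    return changed
```

Calling `record_page` once per scraped page (50 topics each) keeps the lookups on the primary key, so a table with tens of thousands of rows should stay cheap to check. The expensive part is re-fetching the pages, not the database.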
I just skimmed the discussions, and at a glance I can tell that the English forums alone have over 1 million discussions.
But yeah, if you do limit it to 50k, I guess you could get some good data.
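For a rough sense of what a 50k cap means in practice, here is a small back-of-the-envelope sketch. The 50 topics per page and the 50k limit come from the posts above. The two-second delay between requests is purely an assumed value for illustration, not a known rate limit.

```python
import math


def crawl_budget(topic_limit=50_000, topics_per_page=50, delay_seconds=2.0):
    """Return (pages to fetch, total seconds spent waiting between requests).

    delay_seconds is an assumed politeness delay, not a documented limit.
    """
    pages = math.ceil(topic_limit / topics_per_page)
    return pages, pages * delay_seconds


pages, seconds = crawl_budget()
print(pages, seconds / 60)  # 1000 pages, about 33 minutes of waiting
```

So one full pass over 50k topics is around a thousand page requests, which is also what each later re-check for locked or deleted topics would cost.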
I wouldn't dare analyze that much. Plus, how would I tell which thread is a troll and which one is not?
(Since you use Python and it has a wide community of contributors, I bet you can find something basic to help you with that.)