Getting access to Khan Academy's internal API for challenges

3

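I'm writing an application that integrates with Khan Academy and was wondering if anyone knows how to get the challenges a learner has completed. For example, I've logged in and completed some of the challenges in the programming playlist below.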

https://www.khanacademy.org/computing/computer-programming/programming

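When I look at the page itself, it shows some challenges marked as complete, but the Chrome developer console on that page doesn't show any XHR API call fetching that information. Has anyone found which internal API returns the list of completed challenges?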


Following Ben Kraft's suggestion, I tried '/api/v1/user/progress_summary?kind=Exercise' and got: {"started":[],"complete":["ex8e7aac0b"]}

Using: '/api/internal/user/kaid_688515334519823186196256/progress?dt_start=2017-08-15T00:00:00.000Z&dt_end=2018-08-25T00:00:00Z'

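I get a lot of data, but I don't know which other parameters I can use to narrow it down to what I want (the challenges completed in the Intro to JS course).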


1
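Not sure why this was downvoted. All I can say is that Khan Academy itself suggests looking at the internal API to get access to data that isn't available through the public API. - Mark Ellul
3 Answers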

1
This is a heavily modified version of the Khan API example that does exactly what you're asking for (I needed the same information myself).
import cgi
import rauth
import SimpleHTTPServer
import SocketServer
import time
import webbrowser
import requests

students = ['student1@email.com','student2@email.com']
courses = ['programming','html-css','html-css-js','programming-games-visualizations']

# You can get a CONSUMER_KEY and CONSUMER_SECRET for your app here:
# http://www.khanacademy.org/api-apps/register
CONSUMER_KEY = 'abcdefghijklmnop'
CONSUMER_SECRET = 'qrstuvwxyz123456'

CALLBACK_BASE = '127.0.0.1'
SERVER_URL = 'http://www.khanacademy.org'
VERIFIER = None


# Create the callback server that's used to set the oauth verifier after the
# request token is authorized.
def create_callback_server():
    class CallbackHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
        def do_GET(self):
            global VERIFIER

            params = cgi.parse_qs(self.path.split('?', 1)[1],
                keep_blank_values=False)
            VERIFIER = params['oauth_verifier'][0]

            self.send_response(200)
            self.send_header('Content-Type', 'text/plain')
            self.end_headers()
            self.wfile.write('OAuth request token fetched and authorized;' +
                ' you can close this window.')

        def log_request(self, code='-', size='-'):
            pass

    server = SocketServer.TCPServer((CALLBACK_BASE, 0), CallbackHandler)
    return server


# Make an authenticated API call using the given rauth session.
def get_api_resource(session):
    start = time.time()
    allProgress = []

    for student in students:
        print "Getting key for",student
        url = SERVER_URL + '/api/v1/user?email=' + student
        split_url = url.split('?', 1)
        params = {}

        # Separate out the URL's parameters, if applicable.
        if len(split_url) == 2:
            url = split_url[0]
            params = cgi.parse_qs(split_url[1], keep_blank_values=False)

        response = session.get(url, params=params)
        studentKhanData = response.json()

        try:
            if student != studentKhanData['student_summary']['email']:
                print "Mismatch. Khan probably returned my data instead."
                print "This student probably needs to add me as a coach."
                print "Skipping",student
                continue
            key = studentKhanData['student_summary']['key']
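        # A missing 'student_summary' or 'key' raises KeyError; a null response raises TypeError.
        except (TypeError, KeyError) as e: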
            print "Error:",e
            print "Does this student have a Khan account?"
            print "Skipping",student
            continue

        individualProgress = []
        for course in courses:
            print "Getting",course,"progress for",student
            ts = int(time.time()*1000)
            url = SERVER_URL + '/api/internal/user/topic-progress/' + course + '?casing=camel&userKey=' + key + '&lang=en&_=' + str(ts)
            print url
            split_url = url.split('?', 1)
            params = {}

            # Separate out the URL's parameters, if applicable.
            if len(split_url) == 2:
                url = split_url[0]
                params = cgi.parse_qs(split_url[1], keep_blank_values=False)

            response = session.get(url, params=params)
            progressData = response.json()
            progressArray = progressData['topicProgress']

            challengeCount = 0
            for activity in progressArray:
                if activity['status'] == 'complete' and activity['type'] == 'challenge':
                    challengeCount += 1

            individualProgress.append(challengeCount)

        allProgress.append([student,individualProgress])

    for x in allProgress:
        print x

    print "\n"
    end = time.time()
    print "\nTime: %ss\n" % (end - start)

def run_tests():
    # Create an OAuth1Service using rauth.
    service = rauth.OAuth1Service(
           name='autoGrade',
           consumer_key=CONSUMER_KEY,
           consumer_secret=CONSUMER_SECRET,
           request_token_url=SERVER_URL + '/api/auth2/request_token',
           access_token_url=SERVER_URL + '/api/auth2/access_token',
           authorize_url=SERVER_URL + '/api/auth2/authorize',
           base_url=SERVER_URL + '/api/auth2')

    callback_server = create_callback_server()

    # 1. Get a request token.
    request_token, secret_request_token = service.get_request_token(
        params={'oauth_callback': 'http://%s:%d/' %
            (CALLBACK_BASE, callback_server.server_address[1])})

    # 2. Authorize your request token.
    print "Get authorize URL"
    authorize_url = service.get_authorize_url(request_token)
    print authorize_url
    webbrowser.open(authorize_url)
    #It is possible to automate this part using selenium, but it appears to be against Khan Academy's Terms of Service

    callback_server.handle_request()
    callback_server.server_close()

    # 3. Get an access token.
    session = service.get_auth_session(request_token, secret_request_token,
        params={'oauth_verifier': VERIFIER})

    # Fetch challenge progress for every student using the authenticated session.
    print
    get_api_resource(session)


def main():
    run_tests()

if __name__ == "__main__":
    main()
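
The counting step inside the inner loop can be isolated as a small pure function, which makes it easier to check without an OAuth session. This is only a sketch in Python 3: the `type` and `status` keys are the ones the script reads from `topicProgress`, while the sample data (including the `talkthrough` type) is made up for illustration.

```python
def count_completed_challenges(topic_progress):
    """Count entries whose type is 'challenge' and whose status is 'complete'."""
    return sum(
        1
        for activity in topic_progress
        if activity.get('type') == 'challenge'
        and activity.get('status') == 'complete'
    )


# Hypothetical sample shaped like the 'topicProgress' list used in the script.
sample = [
    {'type': 'challenge', 'status': 'complete'},
    {'type': 'challenge', 'status': 'started'},
    {'type': 'talkthrough', 'status': 'complete'},
]
print(count_completed_challenges(sample))
```

Using `.get()` rather than indexing means an entry missing either key is skipped instead of raising `KeyError`.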

1
I think /api/v1/user/progress_summary is your best bet. I'm not sure why it isn't listed in the API explorer, but here's the internal documentation:
Return progress for a content type with started and completed lists.
Takes a comma-separated `kind` param, like:
    /api/v1/user/progress_summary?kind=Video,Article
and returns a dictionary that looks like:
    {"complete": ["a1314267931"], "started": []}

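You'll also need to pass a kaid user identifier, as with the other /api/v1/user routes. If you want more data about individual content items, these IDs should match the data you can get from the topic tree API. As far as I know, this is exactly the same data we use on topic pages.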
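
As a rough sketch of the `kind` parameter described above, the request URL can be assembled from a list of content kinds, and the documented response shape can be read as plain JSON. The host in `BASE_URL` and the helper name `progress_summary_url` are assumptions for illustration; the actual request would still need to go through an authenticated session.

```python
import json
from urllib.parse import urlencode

BASE_URL = 'https://www.khanacademy.org'  # assumed host


def progress_summary_url(kinds):
    """Build the progress_summary URL with a comma-separated `kind` param."""
    # safe=',' keeps the commas literal instead of encoding them as %2C.
    query = urlencode({'kind': ','.join(kinds)}, safe=',')
    return BASE_URL + '/api/v1/user/progress_summary?' + query


url = progress_summary_url(['Video', 'Article'])
print(url)

# Response shape taken from the internal documentation quoted above.
summary = json.loads('{"complete": ["a1314267931"], "started": []}')
print(len(summary['complete']), 'complete;', len(summary['started']), 'started')
```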


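In the end, the kaid isn't needed because this is an authenticated call to the API. The endpoint I'll be using is "/api/v1/user/progress_summary?kind=Article,Scratchpad,Video,Exercise". - Mark Ellul
PS: The results don't exactly match the topic tree. You have to remove the first letter of each result to match the topic tree, i.e. ["a1314267931"] should be ["1314267931"]. - Mark Ellul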
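
The first-letter adjustment mentioned in that comment could look something like the sketch below. The helper name `strip_kind_prefix` is made up, and it simply assumes every ID carries exactly one leading character to drop, as the comment describes.

```python
def strip_kind_prefix(ids):
    """Drop the first character of each ID so it matches the topic tree."""
    return [content_id[1:] for content_id in ids]


# Response shape from the progress_summary documentation quoted above.
summary = {"complete": ["a1314267931"], "started": []}
topic_tree_ids = strip_kind_prefix(summary["complete"])
print(topic_tree_ids)
```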

0
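After some investigation, I found the internal API. The path is below. The user KAID can be found from the public /api/v1/users call. dt_start and dt_end bound the time range you want progress for.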
/api/internal/user/[USER KAID]/progress?dt_start=2016-05-13T22:00:00.000Z&dt_end=2016-05-21T00:00:00Z&tz_offset=120&lang=en&_=1463730370107

I hope this helps someone else in the future.
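
For illustration, the path above can be assembled from a KAID and a pair of datetimes. This is a sketch under a few assumptions: the host in `BASE_URL`, the helper names, and the placeholder `kaid_123` are all made up, and the `_` parameter is treated as a millisecond timestamp, as the value in the example path suggests.

```python
import time
from datetime import datetime, timezone
from urllib.parse import urlencode

BASE_URL = 'https://www.khanacademy.org'  # assumed host


def iso_utc(dt):
    """Format a timezone-aware datetime as e.g. 2016-05-13T22:00:00.000Z."""
    return dt.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.000Z')


def internal_progress_url(kaid, start, end, tz_offset=0, lang='en'):
    """Build the internal progress URL for one user and time range."""
    query = urlencode({
        'dt_start': iso_utc(start),
        'dt_end': iso_utc(end),
        'tz_offset': tz_offset,
        'lang': lang,
        '_': int(time.time() * 1000),  # assumed to be a millisecond timestamp
    }, safe=':')  # keep the colons in the timestamps unescaped
    return '%s/api/internal/user/%s/progress?%s' % (BASE_URL, kaid, query)


url = internal_progress_url(
    'kaid_123',  # placeholder, not a real KAID
    datetime(2016, 5, 13, 22, 0, tzinfo=timezone.utc),
    datetime(2016, 5, 21, 0, 0, tzinfo=timezone.utc),
    tz_offset=120,
)
print(url)
```

The example path mixes timestamps with and without milliseconds; the sketch always emits the `.000Z` form.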


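What are you trying to do with this data? I'm not sure whether /progress is exactly the same as what we show on topic pages, but if what you really want is what the user has completed recently, it might be better! If you want everything they've completed, there may be another call that's better. - Ben Kraft
We're using the public API for user exercises and videos. We want to find out which challenges users have completed and which articles they've read. Unfortunately, the internal progress call only returns challenge information. If you know of an API method that returns everything they've completed, could you post it in an answer? - Mark Ellul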
