파이썬 Input함수의 시간초과, sys.stdin.readline

거대한 정보의 더미들/파이썬 코딩테스트 정리

파이썬 Input함수의 시간초과, sys.stdin.readline - 스택의 구현(백준 10828),

PurpleGuy101 2023. 1. 22. 14:43

백준 알고리즘 기초 (1/2)를 본격적으로 풀어보자.

자료구조수업에서 배웠던 stack이다.

파이썬 역시 list 자료형이 기본적으로 주어지기에,

쉽게 구현하였다.

스택을 포함한 자료구조를 구현할 때 조심해야 할 점은

아래의 기초연산을 제외한 다른 연산을 막 사용했다가

알고리즘의 시간복잡도를 더 늘려버릴 수도 있기에 최대한 기초연산을

생각하며 구현할 수 있도록 노력하는 것이 아닐까?

물론 append가 O(1)의 연산이기는 하지만,

C언어에서 배운 array 구조의 원초적인 접근은

따로 어레이의 길이를 추적하는 length에 해당하는 변수를

만들어서 꾸준히 추적하는 것이 핵심이라고 생각을 했었다.

다만 후술할 시간초과의 문제는 append나 pop을 막 썼다고 생기는 문제는 아니고

input함수가 그만큼 빠르지 않아서 발생하는게 원인이었으나..

파이썬의 내장함수에만 의지하면 언젠가 다른 언어를 배울 때에 걸림돌이 될 것을 어렴풋히 느낄 수 있다.

투박하지만 강인하게 코딩을 해보자. (사실 C++말고 파이썬을 택한 양?반이 할 소리는 아니다..)

arr = [None]*10000

len = 0

iteration = int(input(""))

for i in range(iteration):
    ayy = input("")
    if "push" in ayy:
        pushing_number = int(ayy.split(" ")[-1])
        len += 1
        arr[len-1] = pushing_number
    elif "top"==ayy:
        if len==0:
            print(-1)
        else:
            print(arr[len-1])
    elif "size"==ayy:
        print(len)
    elif "empty"==ayy:
        if len!=0:
            print(0)
        else:
            print(1)
    elif "pop"==ayy:
        if len==0:
            print(-1)
        elif len>0:
            print(arr[len-1])
            len += -1

이렇게 작성한 코드는 시간초과의 문제가 발생한다.

사실 리스트의 내장함수를 써서 문제가 발생한 것은 아니고

표준입출력 과정에서 사용하는 input함수가 상당히 느린편에 속하기 때문.

input 대신 sys 라이브러리를 가져와서

sys.stdin.readline()을 수행하면

C언어에서 gets함수처럼 한줄을 읽어온다.

import sys


arr = [None]*100000

len = 0

iteration = int(sys.stdin.readline())

for i in range(iteration):
    ayy = sys.stdin.readline().split()
    if "push" == ayy[0]:
        pushing_number = ayy[1]
        len += 1
        arr[len-1] = pushing_number
    elif "top"==ayy[0]:
        if len==0:
            print(-1)
        else:
            print(arr[len-1])
    elif "size"==ayy[0]:
        print(len)
    elif "empty"==ayy[0]:
        if len!=0:
            print(0)
        else:
            print(1)
    elif "pop"==ayy[0]:
        if len==0:
            print(-1)
        elif len>0:
            print(arr[len-1])
            len += -1

이렇게 하면 시간초과없이 통과 완료.

왜 이런건지 우리 chatgpt선생님에게 물어보자.

The input() function in Python is slower than the sys.stdin.readline() function because input() performs additional tasks that sys.stdin.readline() does not.The input() function first reads the user's input as a string, and then performs an evaluation of the string to determine the type of data it contains. This evaluation process can be slow, especially if the input string is large or complex. The input() function also automatically strips leading and trailing whitespaces from the input string.On the other hand, sys.stdin.readline() only reads the user's input as a string, without performing any additional evaluations or formatting. This makes it faster than input(). Additionally sys.stdin.readline() is suitable when we are reading inputs from large amount of data.The input() function is more convenient to use in most cases because it automatically returns the input as the correct data type, and it automatically strips leading and trailing whitespaces. However, if performance is a concern and you only need to read a single line of input, sys.stdin.readline() is a faster option.

input 함수는 sys.stdin.readline보다 하는 일이 많다.

input함수는 먼저 사용자의 입력을 문자열로 입력받아서, 이 문자열이 어떤 자료형(data type)을 가지고 있는지 판별한다. 이 판별의 과정이 느려질 수 있는데, 특히 입력받은 문자열이 너무 크거나 복잡할 때이다.
또한 input함수는 입력받은 문자열의 앞과 뒤의 공백(whitespaces)를 자동으로 자르는 과정이 포함된다.

sys.stdin.readline()은 이러한 과정이 없다. input은 입력받은 데이터의 자료형을 판별하고 시작과 끝의 공백을 지워주는 과정없이 받은 인풋을 그대로 문자열로 저장한다. 사용자로부터 한줄한줄 입력을 받을 때는 input이 편리하다.
반대로, 만약 더빠른 퍼포먼스를 원한다면 sys.stdin.readline을 쓰는게 좋은 방법이다.