안녕하세요 ~ 매트릭스 연산 질문입니다!

글쓴이: rudolph0724 / 작성시간: 화, 2015/03/10 - 9:55오후

코드1

for(int row = 0; row < numberOfRow; row++)
{
          for(int col = 0; col < numberOfCol; col++)
          {
                      matrix[row][col] = matrixOne[row][col] + matrixTwo[row][col];
          }
}

코드2

for(int col = 0; col < numberOfCol; col++)
{
          for(int row = 0; row < numberOfRol; row++)
          {
                      matrix[row][col] = matrixOne[row][col] + matrixTwo[row][col];
          }
}

코드1 동작시간 329 ms
코드2 동작시간 28305 ms

우선 코드1과 코드2가 알고리즘 측면에서는 시간이 같아야 하지만 코드2가 시간이 오래 약 90배 오래 걸립니다...
아시는 분있으면 명확한 이유와 설명 부탁드러요 ...

옛날에 배운듯한데 .. 하하하..

메모리안에 매트릭스를 저장하는 방식 때문인가..

만약 매트릭스를 각 행을 기준으로 열을 나란하게 저장하면 코드2는 좀 더 뭔가 ... ~

하하 잘모르겠네요 ^^ 부탁드립니다

Forums:

프로그래밍 QnA

댓글 달기

테스트가 잘 안되네요.

글쓴이: shint / 작성시간: 화, 2015/03/10 - 10:35오후

같은 값을 많이 주면. 메모리를 넘게 되고... 정적 배열이라서... ㅇ_ㅇ;;
값을 다르게 주면. 스택 오버플로우라고 ASSERT에서 메시지가 뜨네요.

	//코드1
	const int numberOfRow = 200;
	const int numberOfCol = 200;
	int matrix[numberOfRow][numberOfCol];
	int matrixOne[numberOfRow][numberOfCol];
	int matrixTwo[numberOfRow][numberOfCol];
 
	DWORD dwStart = GetTickCount();
	for (int row = 0; row < numberOfRow; row++)
	{
		for (int col = 0; col < numberOfRow; col++)
		{
			matrix[row][col] = matrixOne[row][col] + matrixTwo[row][col];
			ASSERT(matrix[row][col]);
		}
	}
	DWORD dwEnd = GetTickCount();
	printf("%ld\n", dwEnd - dwStart);
 
 
	//코드2
	dwStart = GetTickCount();
	for (int col = 0; col < numberOfCol; col++)
	{
		for (int row = 0; row < numberOfCol; row++)
		{
			matrix[row][col] = matrixOne[row][col] + matrixTwo[row][col];
			ASSERT(matrix[row][col]);
		}
	}
	dwEnd = GetTickCount();
	printf("%ld\n", dwEnd - dwStart);

----------------------------------------------------------------------------
젊음'은 모든것을 가능하게 만든다.

매일 1억명이 사용하는 프로그램을 함께 만들어보고 싶습니다.
정규 근로 시간을 지키는. 야근 없는 회사와 거래합니다.

각 분야별. 좋은 책'이나 사이트' 블로그' 링크 소개 받습니다. shintx@naver.com

답글

비슷한 답을 찾았습니다.

글쓴이: shint / 작성시간: 목, 2015/07/02 - 10:03오후

C언어에서 for()문의 배열 접근순서를 바꿔서 속도가 10배 향상이 됩니다. ★★★★★
https://kldp.org/node/152370

----------------------------------------------------------------------------
젊음'은 모든것을 가능하게 만든다.

매일 1억명이 사용하는 프로그램을 함께 만들어보고 싶습니다.
정규 근로 시간을 지키는. 야근 없는 회사와 거래합니다.

각 분야별. 좋은 책'이나 사이트' 블로그' 링크 소개 받습니다. shintx@naver.com

답글

http://stackoverflow.com/ques

글쓴이: esrevinu / 작성시간: 화, 2015/03/10 - 11:03오후

http://stackoverflow.com/questions/9936132/why-does-the-order-of-the-loops-affect-performance-when-iterating-over-a-2d-arra

CPU가 데이터를 메모리에서 캐시 라인이라는 덩어리로 읽어 오기 때문이라네요. C는 배열이 row-major order로 저장되고 따라서 column 루프가 먼저 되는 것이 캐시 적중률이 높겠죠.

답글

감사합니다

글쓴이: rudolph0724 / 작성시간: 금, 2015/03/13 - 9:05오후

감사합니다 저도 나중에 알고보니 cpu 지역성 문제더라고요 ^^

즐거운 하루되세요

답글

캐시와 지역성에 관한 내용이네요.

글쓴이: yukariko / 작성시간: 수, 2015/03/11 - 12:58오전

캐시와 지역성에 관한 내용이네요.

답글

감사합니다

글쓴이: rudolph0724 / 작성시간: 금, 2015/03/13 - 9:06오후

네 감사합니다
^^

답글

코드 수정했습니다.

글쓴이: rudolph0724 / 작성시간: 수, 2015/03/11 - 7:51오전

코드에 오타가 있어서 수정했습니다
1번 코드의 2번째 for문의 numberOfRow를 numberOfCol 로..
2번 코드의 2번째 for문의 numberOfCol을 numberOfRow 로..

코드는 너무 신경쓰지마세용 ㅎㅎ..

문제를 다시 설명 드리면

10000x100000 매트릭스(초기값은 상관없습니다 즉 모두 0 이어도 상관없습니다) 두개를가 주어졌을때.
두 매트릭스의 각각의 값을 합하는 코드입니다.

코드 1번과 2번의 차이는 2개의 반복문(중첩)의 순서가 행,열 아니만 열,행 차이 입니다.

결과를 보면 열,행 순서의 반복문(코드2)이 행,열 순의 반복문 보다 매우 느려서 이렇게 질문을 드린겁니다 ㅎㅎ

2번 코드를 다시올립니다.

int[][] sum(int [][]matrixOne, int [][]matrixTwo)
{

int [][]matrix = null;

if(matrixOne.length == matrixTwo.length
&& matrixOne[0].length == matrixTwo[0].length)
{
int numberOfRow = matrixOne.length;
int numberOfCol = matrixOne[0].length;

matrix = new int[numberOfRow][numberOfCol];

for(int col = 0; col < numberOfCol; col++) //2번 코드
{

for(int row = 0; row < numberOfRow; row++)
{
matrix[row][col] = matrixOne[row][col] + matrixTwo[row][col];
}
}

}
else
{
System.out.println("mismatch row, col");
}

return matrix;
}

답글

요즘 웬만한 컴파일러는 최적화 해주지 않나요? 최적화 옵션을 좀 높이면..

글쓴이: 익명 사용자 / 작성시간: 수, 2015/03/11 - 10:02오전

C나 여타 근래의 대부분의 언어는 Row Major라면, Row Major 라면 Row Major 답게 배열에 접근해야죠. 바로 옆에 인접한 데이터는 row가 같고 column이 +,- 1 인 데이터입니다.

Fortran은.. (그리고 매트랩 등)은 Column Major 입니다. 포트란에서 속도를 높이려면 두 번째처럼 작성해야 합니다. C와 반대로 Row가 +, -1 이고 column 이 같은 데이터가 인접한 데이터입니다.

10년도 더된 Compaq visual fortran 매뉴얼 해당 페이지
http://h21007.www2.hp.com/portal/download/files/unprot/Fortran/docs/vf-html/pg/pguaracc.htm

gcc, gfortran의 경우도 관련 메뉴얼을 찾아보면, 특정 라이브러리를 이용하면 자동으로 최적화시킬 수도 있던 것으로 기억합니다.(최적화 옵션에 따라..)

기본적으로 rowmajor답게 작성해야죠 ㅋ

답글

페이지 보시면 loop transform 옵션을 달면

글쓴이: yeonpil_net / 작성시간: 수, 2015/03/11 - 10:11오전

많이 쓰는 메이저 포트란 컴파일러들은 자동으로 루프 분석해서 row major로 작성된(C나 다른 언어 하는 사람들의 습관이나.. 수학적 표현을 매트릭스로 표현하려다 row major로 코딩하는 경우...)를
포트란 column major로 자동 최적화해주는 기능이 있습니다.

!23456---1----+----2----+----3----+----4----+----5----+----6----+----7-2--+----8
"배웠다"는 "할 수 있다"의 동의어가 아니다.

답글

Graphite

글쓴이: yeonpil_net / 작성시간: 수, 2015/03/11 - 11:23오전

https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

------------------------------------------
-ftree-loop-linear
Perform loop interchange transformations on tree. Same as -floop-interchange. To use this code transformation, GCC has to be configured with --with-isl to enable the Graphite loop transformation infrastructure.
-floop-interchange
Perform loop interchange transformations on loops. Interchanging two nested loops switches the inner and outer loops. For example, given a loop like:

DO J = 1, M
DO I = 1, N
A(J, I) = A(J, I) * C
ENDDO
ENDDO

loop interchange transforms the loop as if it were written:

DO I = 1, N
DO J = 1, M
A(J, I) = A(J, I) * C
ENDDO
ENDDO

which can be beneficial when N is larger than the caches, because in Fortran, the elements of an array are stored in memory contiguously by column, and the original loop iterates over rows, potentially creating at each access a cache miss. This optimization applies to all the languages supported by GCC and is not limited to Fortran. To use this code transformation, GCC has to be configured with --with-isl to enable the Graphite loop transformation infrastructure.
------------------------------------------

!23456---1----+----2----+----3----+----4----+----5----+----6----+----7-2--+----8
"배웠다"는 "할 수 있다"의 동의어가 아니다.

답글

+1

글쓴이: HDNua / 작성시간: 목, 2015/03/12 - 4:36오후

이런 게 있는 모양이네요. 댓글 남겨놓아야겠습니다.

저는 이렇게 생각했습니다.

답글

학부때 들었던 운영체제 개론이 생각나네요

글쓴이: twinwings / 작성시간: 월, 2015/07/06 - 5:32오전

- 최적화의 기본은 예측으로 부터 나온다.
- 예측의 기본은 time locality(한번 읽었던 메모리는 조만간 쓰인다.) /
space locality(한번 읽었던 메모리의 근방은 곧 읽힐 것 이다)

사실 row/col 속도 차이는 언어 때문에 차이 나는게 아니라,
컴퓨터의 아키텍처에 기인한다고 생각합니다.

답글

댓글 달기

이름

제목

댓글 *

텍스트 포맷에 대한 자세한 정보

텍스트 양식

Filtered HTML

텍스트에 BBCode 태그를 사용할 수 있습니다. URL은 자동으로 링크 됩니다.
사용할 수 있는 HTML 태그: <p><div><span><br><a><em><strong><del><ins><b><i><u><s><pre><code><cite><blockquote><ul><ol><li><dl><dt><dd><table><tr><td><th><thead><tbody><h1><h2><h3><h4><h5><h6><img><embed><object><param><hr>
다음 태그를 이용하여 소스 코드 구문 강조를 할 수 있습니다: <code>, <blockcode>, <apache>, <applescript>, <autoconf>, <awk>, <bash>, <c>, <cpp>, <css>, <diff>, <drupal5>, <drupal6>, <gdb>, <html>, <html5>, <java>, <javascript>, <ldif>, <lua>, <make>, <mysql>, <perl>, <perl6>, <php>, <pgsql>, <proftpd>, <python>, <reg>, <spec>, <ruby>. 지원하는 태그 형식: <foo>, [foo].
web 주소와/이메일 주소를 클릭할 수 있는 링크로 자동으로 바꿉니다.

BBCode

텍스트에 BBCode 태그를 사용할 수 있습니다. URL은 자동으로 링크 됩니다.
다음 태그를 이용하여 소스 코드 구문 강조를 할 수 있습니다: <code>, <blockcode>, <apache>, <applescript>, <autoconf>, <awk>, <bash>, <c>, <cpp>, <css>, <diff>, <drupal5>, <drupal6>, <gdb>, <html>, <html5>, <java>, <javascript>, <ldif>, <lua>, <make>, <mysql>, <perl>, <perl6>, <php>, <pgsql>, <proftpd>, <python>, <reg>, <spec>, <ruby>. 지원하는 태그 형식: <foo>, [foo].
사용할 수 있는 HTML 태그: <p><div><span><br><a><em><strong><del><ins><b><i><u><s><pre><code><cite><blockquote><ul><ol><li><dl><dt><dd><table><tr><td><th><thead><tbody><h1><h2><h3><h4><h5><h6><img><embed><object><param>
web 주소와/이메일 주소를 클릭할 수 있는 링크로 자동으로 바꿉니다.

Textile

다음 태그를 이용하여 소스 코드 구문 강조를 할 수 있습니다: <code>, <blockcode>, <apache>, <applescript>, <autoconf>, <awk>, <bash>, <c>, <cpp>, <css>, <diff>, <drupal5>, <drupal6>, <gdb>, <html>, <html5>, <java>, <javascript>, <ldif>, <lua>, <make>, <mysql>, <perl>, <perl6>, <php>, <pgsql>, <proftpd>, <python>, <reg>, <spec>, <ruby>. 지원하는 태그 형식: <foo>, [foo].
You can use Textile markup to format text.
사용할 수 있는 HTML 태그: <p><div><span><br><a><em><strong><del><ins><b><i><u><s><pre><code><cite><blockquote><ul><ol><li><dl><dt><dd><table><tr><td><th><thead><tbody><h1><h2><h3><h4><h5><h6><img><embed><object><param><hr>

Markdown

다음 태그를 이용하여 소스 코드 구문 강조를 할 수 있습니다: <code>, <blockcode>, <apache>, <applescript>, <autoconf>, <awk>, <bash>, <c>, <cpp>, <css>, <diff>, <drupal5>, <drupal6>, <gdb>, <html>, <html5>, <java>, <javascript>, <ldif>, <lua>, <make>, <mysql>, <perl>, <perl6>, <php>, <pgsql>, <proftpd>, <python>, <reg>, <spec>, <ruby>. 지원하는 태그 형식: <foo>, [foo].
Quick Tips:
- Two or more spaces at a line's end = Line break
- Double returns = Paragraph
- *Single asterisks* or _single underscores_ = Emphasis
- **Double** or __double__ = Strong
- This is [a link](http://the.link.example.com "The optional title text")
For complete details on the Markdown syntax, see the Markdown documentation and Markdown Extra documentation for tables, footnotes, and more.
web 주소와/이메일 주소를 클릭할 수 있는 링크로 자동으로 바꿉니다.
사용할 수 있는 HTML 태그: <p><div><span><br><a><em><strong><del><ins><b><i><u><s><pre><code><cite><blockquote><ul><ol><li><dl><dt><dd><table><tr><td><th><thead><tbody><h1><h2><h3><h4><h5><h6><img><embed><object><param><hr>

Plain text

HTML 태그를 사용할 수 없습니다.
web 주소와/이메일 주소를 클릭할 수 있는 링크로 자동으로 바꿉니다.
줄과 단락은 자동으로 분리됩니다.

CAPTCHA

이것은 자동으로 스팸을 올리는 것을 막기 위해서 제공됩니다.

부 메뉴

안녕하세요 ~ 매트릭스 연산 질문입니다!

테스트가 잘 안되네요.

비슷한 답을 찾았습니다.

http://stackoverflow.com/ques

감사합니다

캐시와 지역성에 관한 내용이네요.

감사합니다

코드 수정했습니다.

요즘 웬만한 컴파일러는 최적화 해주지 않나요? 최적화 옵션을 좀 높이면..

페이지 보시면 loop transform 옵션을 달면

Graphite

+1

학부때 들었던 운영체제 개론이 생각나네요

댓글 달기

Filtered HTML

BBCode

Textile

Markdown

Plain text

주 메뉴

둘러보기

부 메뉴

현재 위치

안녕하세요 ~ 매트릭스 연산 질문입니다!

댓글 달기

Filtered HTML

BBCode

Textile

Markdown

Plain text

주 메뉴

검색 폼

둘러보기

사용자 로그인

Oauth2 Login :